Fig. 1. Alan Turing (image source [Ima13])



"And when it comes to mathematics, you must realize that this is the human 
mind at the extreme limit of its capacity." (H. Robbins) 

". . . so reduce the use of the brain and calculate!" (E. W. Dijkstra) 

"The fact that a brain can do it seems to suggest that the difficulties [of trying 
with a machine] may not really be so bad as they now seem." (A. Turing) 
1 Computer Calculation

1.1 a panorama of the status quo 

Where stands the mathematical endeavor? 

In 2012, many mathematical utilities are reaching consolidation. It is an age of 
large aggregates and large repositories of mathematics: the arXiv, Math Reviews, and 
euDML, which promises to aggregate the many European archives such as Zentralblatt 
Math and Numdam. Sage aggregates dozens of mathematically oriented computer pro- 
grams under a single Python-scripted front-end. 

Book sales in the U.S. have been dropping for the past several years. Instead, online 
sources such as Wikipedia and Math Overflow are rapidly becoming students' preferred 
math references. The Polymath blog organizes massive mathematical collaborations. 
Other blogs organize previously isolated researchers into new fields of research. The 
slow, methodical deliberations of referees in the old school are giving way; now in a 
single stroke, Tao blogs, gets feedback, and publishes. 

Machine Learning is in its ascendancy. LogAnswer and Wolfram Alpha answer our 
elementary questions about the quantitative world; Watson our Jeopardy questions. 
Google Page ranks our searches by calculating the leading eigenvector of the largest matrix
the world has ever known. Deep Blue plays our chess games. The million-dollar-
prize-winning Pragmatic Chaos algorithm enhances our Netflix searches. The major 
proof assistants now contain tens of thousands of formal proofs that are being mined 
for hints about how to prove the next generation of theorems. 
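
The eigenvector computation behind page ranking can be illustrated on a toy graph. The following is a minimal sketch, not Google's implementation: it assumes the usual damped random-surfer model with damping factor 0.85 (a conventional choice, not taken from the text), assumes every page has at least one outgoing link, and uses plain power iteration. The function name `pagerank` and the two tiny link graphs are hypothetical.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Power iteration for the stationary vector of a damped link matrix.

    links[i] lists the pages that page i links to. This sketch assumes
    every page links to at least one other page (no dangling pages).
    """
    n = len(links)
    rank = [1.0 / n] * n
    for _ in range(iterations):
        # Every page receives a uniform share from the random jump ...
        new_rank = [(1.0 - damping) / n] * n
        # ... plus an equal share of the rank of each page linking to it.
        for page, targets in enumerate(links):
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# A 3-cycle is symmetric, so all pages should receive the same rank.
cycle = pagerank([[1], [2], [0]])
# Pages 1 and 2 both point to page 0, which points back to both.
hub = pagerank([[1, 2], [0], [0]])
print(cycle)
print(hub)
```

The rank vector is an eigenvector of the damped link matrix for the eigenvalue 1, which is why repeated multiplication converges to it.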

Mathematical models and algorithms rule the quantitative world. Without applied 
mathematics, we would be bereft of Shor's factorization algorithm for quantum computers,
Yang-Mills theories of strong interactions in physics, invisibility cloaks, Radon
transforms for medical imaging, models of epidemiology, risk analysis in insurance, 
stochastic pricing models of financial derivatives, RSA encryption of sensitive data, 
Navier-Stokes modeling of fluids, and models of climate change. Without it, entire 
fields of engineering from Control Engineering to Operations Research would close 
their doors. The early icon of mathematical computing, von Neumann, divided his final
years between meteorology and hydrogen bomb calculations. Today, applications fuel
the economy: in 2011 rankings, the first five of the "10 best jobs" are math or computer
related: software engineer, mathematician, actuary, statistician, and computer systems
analyst [CC11].

Computers have rapidly become so pervasive in mathematics that future generations 
may look back to this day as a golden dawn. A comprehensive survey is out of the 
question. It would almost be like asking for a summary of applications of symmetry to 
mathematics. Computability - like symmetry - is a wonderful structural property that 
some mathematical objects possess that makes answers flow more readily wherever it is 
found. This section gives many examples that give a composite picture of computers in 
mathematical research, showing that computers are neither the panacea that the public 
at large might imagine, nor the evil that the mathematical purist might fear. I have
deliberately selected many examples from pure mathematics, partly because of my own 
background and partly to correct the conventional wisdom that couples computers with 
applied mathematics and blackboards with pure mathematics. 






1.2 Birch and Swinnerton-Dyer conjecture 

I believe that the Birch and Swinnerton-Dyer conjecture is the deepest conjecture ever 
to be formulated with the help of a computer [BSD65]. The Clay Institute has offered a
one-million-dollar prize to anyone who settles it.

Let E be an elliptic curve defined by an equation $y^2 = x^3 + ax + b$ over the field of
rational numbers. Motivated by related quantities in Siegel's work on quadratic forms,
Birch and Swinnerton-Dyer set out to estimate the quantity

$$\prod_{p \le P} \frac{N_p}{p}, \qquad (1)$$

where $N_p$ is the number of rational points on E modulo p, and the product extends
over primes $p \le P$ [Bir02]. Performing experiments on the EDSAC II computer at
the Computer Laboratory at Cambridge University during the years 1958-1962, they
observed that as P increases, the products (1) grow asymptotically in P as

$$c(E) \log^r P,$$

for some constant c, where r is the Mordell-Weil rank of E; that is, the maximum
number of independent points of infinite order in the group E(Q) of rational points.
Following the suggestions of Cassels and Davenport, they reformulated this numerical
asymptotic law in terms of the zeta function L(E, s) of the elliptic curve. Thanks to the
work of Wiles and subsequent extensions of that work, it is known that L(E, s) is an
entire function of the complex variable s. The Birch and Swinnerton-Dyer conjecture
asserts that the rank r of an elliptic curve over Q is equal to the order of the zero of
L(E, s) at s = 1.
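
The experiment of Birch and Swinnerton-Dyer is easy to imitate on a modern machine. The sketch below is only an illustration under stated assumptions: it counts points by brute force with Euler's criterion, counts the point at infinity as one of the points modulo p, and simply skips the prime 2 and the primes dividing the discriminant (a simplification, not the treatment in the original papers). The function names and the sample curve y^2 = x^3 - x are hypothetical choices.

```python
def count_points(a, b, p):
    """Points on y^2 = x^3 + a*x + b over F_p (p an odd prime), plus infinity."""
    total = 1  # the point at infinity
    for x in range(p):
        rhs = (x * x * x + a * x + b) % p
        if rhs == 0:
            total += 1          # single solution y = 0
        elif pow(rhs, (p - 1) // 2, p) == 1:
            total += 2          # rhs is a nonzero square: two values of y
    return total


def primes_up_to(limit):
    """Sieve of Eratosthenes (limit >= 2)."""
    sieve = [True] * (limit + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            for j in range(i * i, limit + 1, i):
                sieve[j] = False
    return [i for i, is_prime in enumerate(sieve) if is_prime]


def bsd_product(a, b, bound):
    """Product of N_p / p over the good odd primes p <= bound."""
    discriminant = -16 * (4 * a ** 3 + 27 * b ** 2)
    product = 1.0
    for p in primes_up_to(bound):
        if p == 2 or discriminant % p == 0:
            continue  # assumption of this sketch: bad primes are skipped
        product *= count_points(a, b, p) / p
    return product


for bound in (100, 1000, 5000):
    print(bound, bsd_product(-1, 0, bound))
```

Plotting such products against log P for several curves is the kind of numerical evidence that suggested the growth law above.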

A major (computer-free) recent theorem establishes that the Birch and Swinnerton-
Dyer conjecture holds for a positive proportion of all elliptic curves over Q [BS10].
This result, although truly spectacular, is mildly misleading in the sense that the elliptic 
curves of high rank rarely occur but pose the greatest difficulties. 



1.3 Sato-Tate 

The Sato-Tate conjecture is another major conjecture about elliptic curves that was
discovered by computer. If E is an elliptic curve with rational coefficients

$$y^2 = x^3 + ax + b,$$

then the number of solutions modulo a prime number p (including the point at infinity)
has the form

$$1 + p - 2\sqrt{p}\,\cos\theta_p,$$

for some real number $0 \le \theta_p \le \pi$. In 1962, Sato, Nagashima, and Namba made
calculations of $\theta_p$ on a Hitachi HIPAC 103 computer to understand how these numbers
are distributed as p varies for a fixed elliptic curve E [Sch]. By the spring of 1963, the
evidence suggested $\sin^2\theta$ as a good fit of the data (Figure 2). That is, if P(n) is the set
of the first n primes, and $f : [0, \pi] \to \mathbb{R}$ is any smooth test function, then for large n,

$$\frac{1}{n} \sum_{p \in P(n)} f(\theta_p) \quad\text{tends to}\quad \frac{2}{\pi} \int_0^{\pi} f(\theta) \sin^2\theta \, d\theta.$$
Fig. 2. Data leading to the Sato-Tate conjecture (image source [SN])

The Sato-Tate conjecture (1963) predicts that this same distribution is obtained, no 
matter the elliptic curve, provided the curve does not have complex multiplication. Tate, 
who arrived at the conjecture independently, did so without computer calculations. 
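
The 1962 calculation can be imitated in a few lines. The following is a sketch under stated assumptions: it recovers the angle from the point count via 1 + p - 2 sqrt(p) cos(theta_p), counts points by brute force, and bins the angles into a coarse histogram. The sample curve y^2 = x^3 + x + 1, the short list of primes, and the function names are hypothetical choices made only for illustration.

```python
import math


def count_points(a, b, p):
    """Points on y^2 = x^3 + a*x + b over F_p (p an odd prime), plus infinity."""
    total = 1  # the point at infinity
    for x in range(p):
        rhs = (x * x * x + a * x + b) % p
        if rhs == 0:
            total += 1
        elif pow(rhs, (p - 1) // 2, p) == 1:  # Euler's criterion
            total += 2
    return total


def sato_tate_angle(a, b, p):
    """Solve N_p = 1 + p - 2*sqrt(p)*cos(theta) for theta in [0, pi]."""
    n_p = count_points(a, b, p)
    return math.acos((1 + p - n_p) / (2.0 * math.sqrt(p)))


def angle_histogram(a, b, primes, bins=10):
    """Count how many angles fall in each of `bins` equal parts of [0, pi]."""
    counts = [0] * bins
    for p in primes:
        theta = sato_tate_angle(a, b, p)
        counts[min(int(theta / math.pi * bins), bins - 1)] += 1
    return counts


sample_primes = [5, 7, 11, 13, 17, 19, 23, 29, 37, 41]
print(angle_histogram(1, 1, sample_primes))
```

With many more primes, the histogram can be compared by eye with the density (2/pi) sin^2(theta); ten primes are far too few to show the shape.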

Serre interpreted Sato-Tate as a generalization of Dirichlet's theorem on primes in
arithmetic progression, and gave a proof strategy of generalizing the analytic properties
of L-functions used in the proof of Dirichlet's theorem [Ser68]. Indeed, a complete
proof of the Sato-Tate conjecture has now been found and is based on extremely deep
analytic properties of L-functions [Car07]. The proof of the Sato-Tate conjecture and its
generalizations has been one of the most significant recent advances in number theory.

1.4 transient uses of computers 

It has become common for problems in mathematics to be first verified by computer and
later confirmed without one. Some examples are the construction of sporadic groups,
counterexamples to a conjecture of Euler, the proof of the Catalan conjecture, and the
discovery of a formula for the binary digits of $\pi$.

Perhaps the best known example is the construction of sporadic groups as part of 
the monumental classification of finite simple groups. The sporadic groups are the 26 
finite simple groups that do not fall into natural infinite families. For example, Lyons 
(1972) predicted the existence of a sporadic group of order

$$2^8 \cdot 3^7 \cdot 5^6 \cdot 7 \cdot 11 \cdot 31 \cdot 37 \cdot 67.$$

In 1973, Sims proved the existence of this group in a long unpublished manuscript that
relied on many specialized computer programs. By 1999, the calculations had become
standardized in group theory packages, such as GAP and Magma [HS99]. Eventually,
computer-free existence and uniqueness proofs were found [MC02], [AS92].

Another problem in finite group theory with a computational slant is the inverse
Galois problem: is every subgroup of the symmetric group $S_n$ the Galois group of a
polynomial of degree n with rational coefficients? In the 1980s Malle and Matzat used
computers to realize many groups as Galois groups [MM99], but with an infinite list of
finite groups to choose from, non-computational ideas have been more fruitful, such as
Hilbert irreducibility, rigidity, and automorphic representations [KLS08].

Euler conjectured (1769) that a fourth power cannot be the sum of three positive
fourth powers, that a fifth power cannot be the sum of four positive fifth powers, and
so forth. In 1966, a computer search [LP66] on a CDC 6600 mainframe uncovered a
counterexample

$$27^5 + 84^5 + 110^5 + 133^5 = 144^5,$$

which can be checked by hand (I dare you). The two-sentence announcement of this
counterexample qualifies as one of the shortest mathematical publications of all time.
Twenty years later, a more subtle computer search gave another counterexample [Elk88]:

$$2682440^4 + 15365639^4 + 18796760^4 = 20615673^4.$$
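
For readers who decline the dare, both counterexamples can be checked with exact integer arithmetic. This is a minimal sketch; the helper name `is_power_sum` is hypothetical.

```python
def is_power_sum(terms, total, exponent):
    """Check whether the sum of terms**exponent equals total**exponent exactly."""
    return sum(term ** exponent for term in terms) == total ** exponent


# The 1966 counterexample for fifth powers.
print(is_power_sum([27, 84, 110, 133], 144, 5))
# The later counterexample for fourth powers.
print(is_power_sum([2682440, 15365639, 18796760], 20615673, 4))
```

Python integers have arbitrary precision, so no rounding enters the check.
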

The Catalan conjecture (1844) asserts that the only solution to the equation

$$x^m - y^n = 1,$$

in positive integers x, y, m, n with exponents m, n greater than 1 is the obvious

$$3^2 - 2^3 = 1.$$

That is, 8 and 9 are the only consecutive positive perfect powers. By the late 1970s,
Baker's methods in diophantine analysis had reduced the problem to an astronomically
large and hopelessly infeasible finite computer search. Mihailescu's proof (2002) of the
Catalan conjecture made light use of computers (a one-minute calculation), and later
the computer calculations were entirely eliminated [Mih04], [Met03].
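
A naive search illustrates the statement, though it says nothing about the proof: it lists every pair of consecutive perfect powers below a modest bound. The sketch assumes bases of at least 2 and a bound of one million, both arbitrary choices; the function names are hypothetical.

```python
def perfect_powers(limit):
    """All integers a**k <= limit with base a >= 2 and exponent k >= 2."""
    powers = set()
    a = 2
    while a * a <= limit:
        value = a * a
        while value <= limit:
            powers.add(value)
            value *= a
        a += 1
    return powers


def consecutive_perfect_powers(limit):
    """Pairs (n, n + 1) with both entries perfect powers not exceeding limit."""
    powers = perfect_powers(limit)
    return sorted((n, n + 1) for n in powers if n + 1 in powers)


print(consecutive_perfect_powers(10 ** 6))
```

A search of this kind can only ever cover a finite range, which is why the reduction to an astronomically large finite search did not settle the conjecture.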

Bailey, Borwein, and Plouffe found an algorithm for calculating the nth binary digit
of $\pi$ directly: it jumps straight to the nth digit without first calculating any of the earlier
digits. They understood that to design such an algorithm, they would need an infinite
series for $\pi$ in which powers of 2 controlled the denominators. They did not know of any
such formula, and made a computer search (using the PSLQ lattice reduction algorithm)
for any series of the desired form. Their search unearthed a numerical identity

$$\pi = \sum_{n=0}^{\infty} \left( \frac{4}{8n+1} - \frac{2}{8n+4} - \frac{1}{8n+5} - \frac{1}{8n+6} \right) \left( \frac{1}{16} \right)^n,$$

which was then rigorously proved and used to implement their binary-digits algorithm. 
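
The digit-extraction idea can be sketched as follows. This is an illustrative floating-point version, not the authors' implementation: it works in base 16 (each hexadecimal digit packs four binary digits), uses modular exponentiation to discard the integer part of 16^n times each of the four sums in the identity, and is only trustworthy for small positions because of rounding. The function names are hypothetical.

```python
def _fractional_series(j, n):
    """Fractional part of the sum over k of 16**(n - k) / (8*k + j)."""
    total = 0.0
    # Terms with k <= n: reduce 16**(n - k) modulo the denominator first.
    for k in range(n + 1):
        denominator = 8 * k + j
        total = (total + pow(16, n - k, denominator) / denominator) % 1.0
    # Terms with k > n are already smaller than 1 and shrink geometrically.
    k = n + 1
    while True:
        term = 16.0 ** (n - k) / (8 * k + j)
        if term < 1e-17:
            break
        total += term
        k += 1
    return total % 1.0


def pi_hex_digit(n):
    """Hexadecimal digit of pi at position n + 1 after the point (n = 0 is the first)."""
    x = (4 * _fractional_series(1, n) - 2 * _fractional_series(4, n)
         - _fractional_series(5, n) - _fractional_series(6, n)) % 1.0
    return int(16 * x)


print("".join("%X" % pi_hex_digit(n) for n in range(8)))
```

The point of the design is that position n is reached without storing or computing the digits before it; only the modular powers depend on n.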






1.5 Rogers-Ramanujan identities 

The famous Rogers-Ramanujan identities

$$1 + \sum_{k=1}^{\infty} \frac{q^{k^2 + ak}}{(1-q)(1-q^2)\cdots(1-q^k)} = \prod_{j=0}^{\infty} \frac{1}{(1-q^{5j+a+1})(1-q^{5j-a+4})}, \qquad a = 0, 1,$$

can now be proved by an almost entirely mechanical procedure from Jacobi's triple
product identity and the q-WZ algorithm of Wilf and Zeilberger that checks identities
of q-hypergeometric finite sums [Pau94]. Knuth's foreword to a book on the WZ
method opens, "Science is what we understand well enough to explain to a computer.
Art is everything else we do." Through the WZ method, many summation identities
have become a science [PWZ96].
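
A numerical sanity check is much weaker than a WZ-style proof, but it shows what the identities assert. The sketch below assumes the standard form of the identities: for a = 0 or 1, the sum over k of q^(k^2 + a k) / ((1 - q)...(1 - q^k)) equals the product of 1 / (1 - q^m) over m congruent to a + 1 or 4 - a modulo 5. It compares power series coefficients up to a chosen degree; the function names are hypothetical.

```python
def inverse_product_series(parts, degree):
    """Coefficients up to q**degree of the product of 1 / (1 - q**m) for m in parts."""
    coeffs = [1] + [0] * degree
    for m in parts:
        for n in range(m, degree + 1):
            coeffs[n] += coeffs[n - m]
    return coeffs


def rogers_ramanujan_sum(a, degree):
    """Left side: sum over k of q**(k*k + a*k) / ((1 - q) ... (1 - q**k))."""
    total = [0] * (degree + 1)
    k = 0
    while k * k + a * k <= degree:
        shift = k * k + a * k
        factor = inverse_product_series(range(1, k + 1), degree - shift)
        for n, value in enumerate(factor):
            total[shift + n] += value
        k += 1
    return total


def rogers_ramanujan_product(a, degree):
    """Right side: product over m = a + 1 or 4 - a (mod 5) of 1 / (1 - q**m)."""
    residues = ((a + 1) % 5, (4 - a) % 5)
    parts = [m for m in range(1, degree + 1) if m % 5 in residues]
    return inverse_product_series(parts, degree)


for a in (0, 1):
    print(a, rogers_ramanujan_sum(a, 30) == rogers_ramanujan_product(a, 30))
```

Agreement of the first thirty coefficients is evidence, not proof; the mechanical procedures cited above certify the identity for all coefficients at once.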



1.6 packing tetrahedra 

Aristotle erroneously believed that regular tetrahedra tile space: "It is agreed that there 
are only three plane figures which can fill a space, the triangle, the square, and the 
hexagon, and only two solids, the pyramid and the cube" [AriBC]. However, centuries
later, when the dihedral angle of the regular tetrahedron was calculated:

$$\arccos(1/3) \approx 1.23 < 1.25664 \approx 2\pi/5,$$

it was realized that a small gap is left when five regular tetrahedra are grouped around a
common edge (Figure 3). In 1900, in his famous list of problems, Hilbert asked "How
can one arrange most densely in space an infinite number of equal solids of given form, 
e.g., spheres with given radii or regular tetrahedra . . . ?" 
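
The size of the gap left by five tetrahedra around an edge follows directly from the dihedral angle. A minimal sketch of the arithmetic, with a hypothetical function name:

```python
import math


def tetrahedron_gap():
    """Angle left over when five regular tetrahedra share a common edge."""
    dihedral = math.acos(1.0 / 3.0)      # dihedral angle of a regular tetrahedron
    return 2.0 * math.pi - 5.0 * dihedral


gap = tetrahedron_gap()
print(gap, math.degrees(gap))  # roughly 0.128 radians, a little over 7 degrees
```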




Fig. 3. Regular tetrahedra fail to tile space (image source [Doy11]).



Aristotle notwithstanding, until recently, no arrangements of regular tetrahedra with 
high density were known to exist. In 2000, Betke and Henk developed an efficient
computer algorithm to find the densest lattice packing of a general convex body [BH00].
This opened the door to experimentation [CT06]. For example, the algorithm can determine
the best lattice packing of the convex hull of the cluster of tetrahedra in Figure 3.
In rapid succession came new record-breaking arrangements of tetrahedra, culminating
in what is now conjectured to be the best possible [CEG10]. (See Figure 4.) Although
Chen had the panache to hand out Dungeons and Dragons tetrahedral dice to the audience
for a hands-on modeling session during her thesis defense, the best arrangement
was found using Monte Carlo experiments. In the numerical simulations, a finite number
of tetrahedra are randomly placed in a box of variable shape. The tetrahedra are
jiggled as the box slowly shrinks until no further improvement is possible. Now that a 
precise conjecture has been formulated, the hardest part still remains: to give a proof. 




Fig. 4. The best packing of tetrahedra is believed to be the Chen-Engel-Glotzer arrangement with
density 4000/4671 ≈ 0.856 (image source [CEG10]).



1.7 the Kepler conjecture 

Hilbert's 18th problem asks to find dense packings of both spheres and regular tetra- 
hedra. The problem of determining the best sphere packing in three dimensions is the 
Kepler conjecture. Kepler was led to the idea of density as an organizing principle in 
nature by observing the tightly packed seeds in a pomegranate. Reflecting on the hexag- 
onal symmetry of snowflakes and honeycombs, by capping each honeycomb cell with 
a lid of the same shape as the base of the cell, he constructed a closed twelve-sided cell 
that tiles space. Kepler observed that the familiar pyramidal cannonball arrangement is 
obtained when a sphere is placed in each capped honeycomb cell (Figure 5). This he
believed to be the densest packing. 

L. Fejes Tóth proposed a strategy to prove Kepler's conjecture in the 1950s, and later
he suggested that computers might be used. The proof, finally obtained by Ferguson and
me in 1998, is one of the most difficult nonlinear optimization problems ever rigorously
solved by computer [Hal05b]. The computer calculations originally took about 2000
hours to run on Sparc workstations. Recent simplifications in the proof have reduced 
the runtime to about 20 hours and have reduced the amount of customized code by a 
factor of more than 10. 

Fig. 5. An optimal sphere packing is obtained by placing one sphere in each three-dimensional
honeycomb cell (image source [RhD11]).

1.8 the four-color theorem

The four-color theorem is the most celebrated computer proof in the history of mathematics.
The theorem asserts that it is possible to color the countries of any map with at
most four colors in such a way that contiguous countries receive different colors. The
proof of this theorem required about 1200 hours on an IBM 370-168 in 1976. So much
has been written about Appel and Haken's computer solution to this problem that it is
pointless to repeat it here [AHK77]. Let it suffice to cite a popular account [Wil02], a
sociological perspective [Mac01], the second generation proof [RSST97], and the culminating
formal verification [Gon08].

1.9 projective planes 

A finite projective plane of order n > 1 is defined to be a set of $n^2 + n + 1$ lines and
$n^2 + n + 1$ points with the following properties:

1. Every line contains n + 1 points;

2. Every point is on n + 1 lines;

3. Every two distinct lines have exactly one point of intersection; 

4. Every two distinct points lie on exactly one line. 




Fig. 6. The Fano plane is a finite projective plane of order 2. 
The definition is an abstraction of properties that evidently hold for $P^2(\mathbb{F}_q)$, the
projective plane over a finite field $\mathbb{F}_q$, with q = n, for any prime power q. In particular, a
finite projective plane exists whenever n is a positive power of a prime number (Figure 6).
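
The four defining properties can be checked by machine for small planes. The sketch below is restricted to prime q, so that arithmetic modulo q is field arithmetic; genuine prime powers would need a finite field implementation, which is omitted here. Points and lines are both represented by normalized nonzero triples, with incidence given by a vanishing dot product; all names are hypothetical.

```python
from itertools import combinations


def projective_plane(p):
    """Points and lines of the projective plane over F_p, for p prime."""
    # One representative per projective point: first nonzero coordinate is 1.
    points = ([(1, a, b) for a in range(p) for b in range(p)]
              + [(0, 1, a) for a in range(p)] + [(0, 0, 1)])
    # The line with coefficients c consists of the points x with c . x = 0 mod p.
    lines = [frozenset(pt for pt in points
                       if sum(c * x for c, x in zip(coeffs, pt)) % p == 0)
             for coeffs in points]
    return points, lines


def satisfies_axioms(p):
    """Check the point and line counts and the four incidence properties."""
    points, lines = projective_plane(p)
    size = p * p + p + 1
    return (len(points) == size and len(set(lines)) == size
            and all(len(line) == p + 1 for line in lines)
            and all(sum(pt in line for line in lines) == p + 1 for pt in points)
            and all(len(l1 & l2) == 1 for l1, l2 in combinations(lines, 2))
            and all(sum(a in line and b in line for line in lines) == 1
                    for a, b in combinations(points, 2)))


for p in (2, 3, 5):
    print(p, satisfies_axioms(p))
```

For p = 2 the construction returns the seven points and seven lines of the Fano plane.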

The conjecture is that the order n > 1 of every finite projective plane is a prime power.
The smallest integers n > 1 that are not prime powers are 



6, 10, 12, 14, 15, ... 



The brute force approach to this conjecture is to eliminate each of these possibilities 
in turn. The case n = 6 was settled in 1938. Building on a number of theoretical advances
[MST73], Lam eliminated the case n = 10 in 1989, in one of the most difficult
computer proofs in history [LTS89]. This calculation was executed over a period of
years on multiple machines and eventually totaled about 2000 hours of Cray-1A time.

Unlike the computer proof of the four-color theorem, the projective plane proof has 
never received independent verification. Because of the possibilities of programming 
errors and soft errors (see Section 3.5), Lam is unwilling to call his result a proof. He
writes, "From personal experience, it is extremely easy to make programming mistakes. 
We have taken many precautions, . . . Yet, I want to emphasize that this is only an ex- 
perimental result and it desperately needs an independent verification, or better still, a 
theoretical explanation" [Lam91].

Recent speculation at Math Overflow holds that the next case, n = 12, remains
solidly out of computational reach [Hor10].



1.10 hyperbolic manifolds 

Computers have helped to resolve a number of open conjectures about hyperbolic manifolds
(defined as complete Riemannian manifolds with constant negative sectional curvature
-1), including the proof that the space of hyperbolic metrics on a closed hyperbolic
3-manifold is contractible [GMT03], [Gab10].



1.11 chaos theory and strange attractors 

The theory of chaos has been one of the great success stories of twentieth century math- 
ematics and science. Turing expressed the notion of chaos with these words, "quite 
small errors in the initial conditions can have an overwhelming effect at a later time.
The displacement of a single electron by a billionth of a centimetre at one moment
might make the difference between a man being killed by an avalanche a year later,
or escaping" [Tur50]. Later, the metaphor became a butterfly that stirs up a tornado in
Texas by flapping its wings in Brazil. 

Thirteen years later, Lorenz encountered chaos as he ran weather simulations on a
Royal McBee LGP-30 computer [Lor63]. When he reran an earlier numerical solution
with what he thought to be identical initial data, he obtained wildly different results.
He eventually traced the divergent results to a slight discrepancy in initial conditions
caused by rounding in the printout. The Lorenz oscillator is the simplified form of
Lorenz's original ordinary differential equations.

^1 For early history, see [Wol02, p. 971]. Turing vainly hoped that digital computers might be
insulated from the effects of chaos.

A set A is attracting if it has a neighborhood U such that

$$A = \bigcap_{t \ge 0} f_t(U),$$

where $f_t(x)$ is the solution of the dynamical system (in the present case the Lorenz oscillator)
at time t with initial condition x. That is, U flows towards the attracting set A.
Simulations have discovered attracting sets with strange properties such as non-integral 
Hausdorff dimension and the tendency for a small slab of volume to quickly spread 
throughout the attractor. 
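
Lorenz's accidental experiment is easy to repeat. The sketch below makes several assumptions that are not taken from the text: it uses the commonly quoted form of the Lorenz equations with parameters sigma = 10, rho = 28, beta = 8/3, a fixed-step fourth-order Runge-Kutta integrator, and a perturbation of one part in a million in the first coordinate. Being ordinary floating-point arithmetic, it carries no rigorous error bounds; the names are hypothetical.

```python
def lorenz(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz equations (standard parameter values assumed)."""
    x, y, z = state
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)


def rk4_step(state, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))

    k1 = lorenz(state)
    k2 = lorenz(shifted(state, k1, dt / 2))
    k3 = lorenz(shifted(state, k2, dt / 2))
    k4 = lorenz(shifted(state, k3, dt))
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def max_separation(start, perturbation=1e-6, dt=0.01, steps=4000):
    """Largest distance reached between two trajectories that start almost together."""
    a = start
    b = (start[0] + perturbation, start[1], start[2])
    largest = 0.0
    for _ in range(steps):
        a, b = rk4_step(a, dt), rk4_step(b, dt)
        distance = sum((u - v) ** 2 for u, v in zip(a, b)) ** 0.5
        largest = max(largest, distance)
    return largest


print(max_separation((1.0, 1.0, 20.0)))
```

The two trajectories stay close at first and then separate until their distance is comparable to the size of the attractor, which is the behavior Lorenz traced back to rounding in his printout.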




Fig. 7. The Lorenz oscillator gives one of the most famous images of mathematics, a strange 
attractor in dynamical systems (image source [Aga13]).

Lorenz conjectured in 1963 that his oscillator has a strange attractor (Figure 7). In
1982, the Lax report cited soliton theory and strange attractors as two prime examples of
the "discovery of new phenomena through numerical experimentation," and called such
discovery perhaps the most "significant application of scientific computing" [Lax82].
Smale, in his list of 18 "Mathematical Problems for the Next Century," made the fourteenth
problem to present a rigorous proof that the dynamics of the Lorenz oscillator is
a strange attractor [Sma98] with various additional properties that make it a "geometric
Lorenz attractor."

Tucker has solved Smale's fourteenth problem by computer [Tuc02], [Ste00]. One
particularly noteworthy aspect of this work is that chaotic systems, by their very na- 
ture, pose particular hardships for rigorous computer analysis. Nevertheless, Tucker 
implemented the classical Euler method for solving ordinary differential equations with 
particular care, using interval arithmetic to give mathematically rigorous error bounds. 
Tucker has been awarded numerous prizes for this work, including the Moore Prize 
(2002) and the EMS Prize (2004). 
Smale's list in general envisions a coming century in which computer science, especially
computational complexity, plays a much larger role than during the past cen-
tury. He finishes the list with the open-ended philosophical problem that echoes Turing: 
"What are the limits of intelligence, both artificial and human?" 

1.12 4/3 

Mandelbrot's conjectures in fractal geometry have resulted in two Fields Medals. Here
he describes the discovery of the 4/3-conjecture made in [Man82]. "The notion that
these conjectures might have been reached by pure thought - with no picture - is simply
inconceivable. I had my programmer draw a very big sample [Brownian] motion
and proceeded to play with it." He goes on to describe computer experiments that led
him to enclose the Brownian motion into black clusters that looked to him like islands
with jagged coastlines (Figure 8). "[I]nstantly, my long previous experience with the
coastlines of actual islands on Earth came handy and made me suspect that the boundary
of Brownian motion has a fractal dimension equal to 4/3" [Man04].

This conjecture, which Mandelbrot's trained eye spotted in an instant, took 18 years
to prove [LSW01].




Fig. 8. A simulation of planar Brownian motion. Mandelbrot used "visual inspection supported 
by computer experiments" to formulate deep conjectures in fractal geometry (image generated 
from source code at [LSW01]).



1.13 sphere eversion visualization 

Smale (1958) proved that it is possible to turn a sphere inside out without introducing
any creases.^2 For a long time, this paradoxical result defied the intuition of experts.
R. Bott, who had been Smale's graduate advisor, refused to believe it at first. Levy
writes that trying to visualize Smale's mathematical argument "is akin to describing
what happens to the ingredients of a souffle in minute detail, down to the molecular
chemistry, and expecting someone who has never seen a souffle to follow this 'recipe'
in preparing the dish" [Lev95].

^2 I am fond of this example, because the Scientific American article [Phi66] about this theorem
was my first exposure to "real mathematics" as a child.




Fig. 9. Computer-generated stages of a sphere eversion (image source [Opt11]).



It is better to see and taste a souffle first. The computer videos of this theorem 
are spectacular. Watch them on YouTube! As we watch the sphere turn inside out, our 
intuition grows. The computer calculations behind the animations of the first video (the 
Optiverse) start with a sphere, half inverted and half right-side out [SFL]. From the halfway
position, the path of steepest descent of an energy functional is used to calculate the
unfolding in both directions to the round spheres, with one fully inverted (Figure 9). The
second video is based on Thurston's "corrugations" [LMM94]. As the name suggests,
this sphere eversion has undulating ruffles that dance like a jellyfish, but avoids sharp 
creases. Through computers, understanding. 

Speaking of Thurston, he contrasts "our amazingly rich abilities to absorb geometric
information and the weakness of our innate abilities to convey spatial ideas. We effortlessly
look at a two-dimensional picture and reconstruct a three-dimensional scene,
but we can hardly draw them accurately" [Pit11]. As more and more mathematics migrates
to the computer, there is a danger that geometrical intuition becomes buried under
a logical symbolism.
1.14 minimal surface visualization

Weber and Wolf [WW11] report that the use of computer visualization has become
"commonplace" in minimal surface research, "a conversation between visual aspects
of the minimal surfaces and advancing theory, each supporting the other." This started
when computer illustrations of the Costa surface (a particular minimal surface, Figure 10)
in the 1980s revealed dihedral symmetries of the surface that were not seen
directly from its defining equations. The observation of symmetry turned out to be the 
key to the proof that the Costa surface is an embedding. The symmetries further led to 
a conjecture and then proof of the existence of other minimal surfaces of higher genus 
with similar dihedral symmetries. As Hoffman wrote about his discoveries, "The im- 
ages produced along the way were the objects that we used to make discoveries. They 
are an integral part of the process of doing mathematics, not just a way to convey a 
discovery made without their use" [Hof87].




Fig. 10. The Costa surface launched an era of computer exploration in minimal surface theory 
(image source [San12]).



1.15 double bubble conjecture 

Closely related to minimal surfaces are surfaces of constant mean curvature. The mean 
curvature of a minimal surface is zero; surfaces whose mean curvature is constant are 
a slight generalization. They arise as surfaces that are minimal subject to the constraint 
that they enclose a region of fixed volume. Soap bubble films are surfaces of constant 
mean curvature. 

The isoperimetric inequality asserts that the sphere minimizes the surface area among 
all surfaces that enclose a region of fixed volume. The double bubble problem is the 
generalization of the isoperimetric inequality to two enclosed volumes. What is the surface
minimizing way to enclose two separate regions of fixed volume? In the nineteenth
century, Boys [Boy90] and Plateau observed experimentally that the answer should be
two partial spherical bubbles joined along a shared flat disk (Figure 11). The size of the
shared disk is determined by the condition that angles should be 120° where the three
surfaces meet. This is the double bubble conjecture.

The first case of the double bubble conjecture to be established was that of two equal
volumes [HHS95]. The proof was a combination of conventional analysis and computer
proof. Conventional analysis (geometric measure theory) established the existence of a
minimizer and reduced the possibilities to a small number of figures of revolution, and
computers were used to analyze each of the cases, showing in each case by interval analysis
either that the case was not a local minimizer or that its area was strictly larger than
the double bubble. Later theorems proved the double bubble conjecture in the general
unequal volume case without the use of computers [HMRR00].




Fig. 11. The optimality of a double bubble was first established by computer, using interval
analysis (image source [Tsi13]).



The natural extension of the double bubble conjecture from two bubbles to an in- 
finite bubbly foam is the Kelvin problem. The problem asks for the surface area mini- 
mizing partition of Euclidean space into cells of equal volume. Kelvin's conjecture - a 
tiling by slight perturbations of truncated octahedra - remained the best known partition 
until a counterexample was constructed by two physicists, Phelan and Weaire, in 1993
(Figure 12). The counterexample exists not as a physical model, nor as an exact
mathematical formula, but only as an image generated from a triangular mesh in the Surface
Evolver computer program. By default, the counterexample has become the new con- 
jectural answer to the Kelvin problem, which I fully expect to be proved someday by 
computer. 



1.16 kissing numbers 

In the plane, at most six pennies can be arranged in a hexagon so that they all touch one
more penny placed at the center of the hexagon (Figure 13). Odlyzko and Sloane solved
the corresponding problem in dimension 8: at most 240 nonoverlapping congruent balls
can be arranged so that they all touch one more at the center.

Up to rotation, a unique arrangement of 240 exists. To the cognoscenti, the proof of
this fact is expressed as a one-line certificate:

$$\left(t - \tfrac{1}{2}\right) t^2 \left(t + \tfrac{1}{2}\right)^2 (t + 1).$$
Fig. 12. The Phelan-Weaire foam, giving the best known partition of Euclidean space into cells
of equal volume, was constructed with Surface Evolver software. This foam inspired the bubble
design of the Water Cube building in the 2008 Beijing Olympics (image source [PW11]).



Fig. 13. In two dimensions, the kissing number is 6. In eight dimensions, the answer is 240. The 
proof certificate was found by linear programming. 



(For an explanation of the certificates, see [PZ04].) The certificate was produced by a
linear programming computer search, but once the certificate is in hand, the proof is 
computer-free. 

As explained above, six is the kissing number in two dimensions, and 240 is the kissing
number in eight dimensions. In three dimensions, the kissing number is 12. This three-dimensional
problem goes back to a discussion between Newton and Gregory in 1694,
but was not settled until the 1950s. A recent computer proof makes an exhaustive search
through nearly 100 million combinatorial possibilities to determine exactly how much
the twelve spheres must shrink to accommodate a thirteenth [MT10]. Bachoc and Vallentin
were recently awarded the SIAG/Optimization prize for their use of semidefinite
programming algorithms to establish new proofs of the kissing number in dimensions
3, 4, 8 and new bounds on the kissing number in various other dimensions [BV08].
1.17 digression on $E_8$

It is no coincidence that the calculation of Odlyzko and Sloane works in dimension 8. 
Wonderful things happen in eight dimensional space and again in 24 dimensions. 

Having mentioned the 240 balls in eight dimensions, I cannot resist mentioning
some further computer proofs. The centers of the 240 balls are vectors whose integral
linear combinations generate a lattice in $\mathbb{R}^8$, known as the $E_8$ lattice (Figure 14).
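
The 240 vectors can be written down explicitly and inspected. The sketch below uses one standard coordinate description of the E_8 roots, stated here as an assumption since the text does not give coordinates: all vectors with two entries equal to +1 or -1 and six zeros, together with all vectors with entries +1/2 or -1/2 and an even number of minus signs. Coordinates are doubled to stay in integers. It also evaluates the certificate polynomial of Section 1.16, taken in the form (t - 1/2) t^2 (t + 1/2)^2 (t + 1), at the cosines that occur; the function names are hypothetical.

```python
from itertools import combinations, product


def e8_roots_doubled():
    """The 240 roots in one standard description, with all coordinates doubled."""
    roots = []
    # 112 roots: two entries equal to +-1 (here +-2), six zeros.
    for i, j in combinations(range(8), 2):
        for si, sj in product((2, -2), repeat=2):
            vector = [0] * 8
            vector[i], vector[j] = si, sj
            roots.append(tuple(vector))
    # 128 roots: all entries +-1/2 (here +-1) with an even number of minus signs.
    for signs in product((1, -1), repeat=8):
        if signs.count(-1) % 2 == 0:
            roots.append(signs)
    return roots


def cosines(roots):
    """Cosines of the angles between distinct roots (each doubled root has norm squared 8)."""
    return {sum(a * b for a, b in zip(u, v)) / 8 for u, v in combinations(roots, 2)}


def certificate(t):
    """The certificate polynomial, in the form assumed above."""
    return (t - 0.5) * t ** 2 * (t + 0.5) ** 2 * (t + 1)


roots = e8_roots_doubled()
print(len(roots), sorted(cosines(roots)))
print([certificate(c) for c in sorted(cosines(roots))])
```

No cosine exceeds 1/2, so distinct centers are at least 60 degrees apart as seen from the origin, which is exactly the condition for 240 equal balls to touch a central ball without overlapping. The polynomial vanishes at every cosine that occurs.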

There is a packing of congruent balls in eight dimensions that is obtained by centering
one ball at each vector in the $E_8$ lattice, making the balls as large as possible without
overlap. Everyone believes that this packing in eight dimensions is the densest possible,
but this fact currently defies proof. If the centers of the balls are the points of a lattice,
then the packing is called a lattice packing. Cohn and Kumar have a beautiful computer-assisted
proof that the $E_8$ packing is the densest of all lattice packings in $\mathbb{R}^8$ (and the
corresponding result in dimension 24 for the Leech lattice). The proof is based on the
Poisson summation formula. Pfender and Ziegler's account of this computer-assisted
proof won the Chauvenet Prize of the MAA for writing [PZ04].



Fig. 14. The $E_8$ lattice is generated by eight vectors in $\mathbb{R}^8$ whose mutual angles are 120° or 90°
depending on whether the corresponding dots are joined by a segment or not.



The 240 vectors that generate the lattice are the roots of a 240 + 8 dimensional Lie
group (also called $E_8$); that is, a differentiable manifold that has the analytic structure of
a group. All simple Lie groups were classified in the nineteenth century.* They fall into
infinite families named alphabetically, $A_n$, $B_n$, $C_n$, $D_n$, with 5 more exceptional cases
that do not fall into infinite families: $E_6$, $E_7$, $E_8$, $F_4$, $G_2$. The exceptional Lie group** of
highest dimension is $E_8$.

The long-term Atlas Project aims to use computers to determine all unitary representations
of real reductive Lie groups [Atl]. The 19-member team focused on $E_8$ first,
because everyone respects the formidable $E_8$. By 2007, a computer had completed the
character table of $E_8$. Since there are infinitely many irreducible characters and each
character is an analytic function on (a dense open subset of) the group, it is not clear
without much further explanation what it might even mean for a computer to output
the full character table as a 60 gigabyte file [Ada11]. What is significant about this
work is that it brings the computer to bear on some abstract parts of mathematics that 
have been traditionally largely beyond the reach of concrete computational description, 

* I describe the families over $\mathbb{C}$. Each complex Lie group has a finite number of further real
forms.

** For decades, $E_8$ has stood for the ultimate in speculative physics, whether in heterotic string
theory or a "theory of everything." Last year, $E_8$ took a turn toward the real world, when $E_8$
calculations predicted neutron scattering experiments with a cobalt niobate magnet [BG11].
including infinite dimensional representations of Lie groups, intersection cohomology
and perverse sheaves. Vogan's account of this computational project was awarded the
2011 Conant Prize of the AMS [Vog07].

While on the topic of computation and representation theory, I cannot resist a digression
into the P versus NP problem, the most fundamental unsolved problem in
mathematics. In my opinion, attempts to settle P versus NP from the axioms of ZFC are
ultimately as ill-fated as Hilbert's program in the foundations of math (which nonetheless
spurred valuable partial results such as the decision procedures of Presburger and
Tarski), but if I were to place faith anywhere, it would be in Mulmuley's program in
geometric complexity theory. The program invokes geometric invariant theory and
representation theoretic invariants to tease apart complexity classes: if the irreducible
constituents of modules canonically associated with two complexity classes are different,
then the two complexity classes are distinct. In this approach, the determinant and
permanent of a matrix are chosen as the paradigms of what is easy and hard to compute,
opening up complexity theory to a rich algebro-geometric structure [Mul11], [For09].

1.18 future computer proofs 

Certain problems are natural candidates for computer proof: the Kelvin problem by 
the enumeration of the combinatorial topology of possible counterexamples; the search 
for a counterexample to the two-dimensional Jacobian conjecture through the minimal 
model program [Bor09]; resolution of singularities in positive characteristic through
an automated search for numerical quantities that decrease under suitable blowup;
existence of a projective plane of order 12 by constraint satisfaction programming; the
optimality proof of the best known packing of tetrahedra in three dimensions [CEG10];
Steiner's isoperimetric conjecture (1841) for the icosahedron [Ste41]; and the Reinhardt
conjecture through nonlinear optimization [Hal11]. But proceed with caution! As a check
on our zeal for brute computation, computer-generated patterns can sometimes fail
miserably. For example, the sequence:



$$\left\lceil \frac{2}{2^{1/n} - 1} \right\rceil - \left\lfloor \frac{2n}{\log 2} \right\rfloor, \qquad n = 1, 2, 3, \ldots$$



starts out as the zero sequence, but remarkably first gives a nonzero value when n 
reaches 777,451,915,729,368 and then again when n = 140,894,092,055,857,794. 
See [Sta07].
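The early terms of this sequence can be checked exactly on a computer. The following is only an illustrative Python sketch, not code from [Sta07]. It assumes the sequence is the ceiling of 2/(2^(1/n) - 1) minus the floor of 2n/log 2, and the helper name `term` is hypothetical. To avoid rounding trouble when the first quantity is an exact integer (as it is for n = 1), the ceiling is found with integer arithmetic, using the fact that m >= 2/(2^(1/n) - 1) exactly when 2 m^n >= (m + 2)^n. The floor is computed with 50-digit decimal arithmetic, which is ample for small n.

```python
from decimal import Decimal, getcontext, ROUND_FLOOR

getcontext().prec = 50          # ample precision for the small n used here
LN2 = Decimal(2).ln()

def term(n):
    """n-th term of ceil(2 / (2**(1/n) - 1)) - floor(2*n / log 2)."""
    floor_part = int((Decimal(2 * n) / LN2).to_integral_value(rounding=ROUND_FLOOR))
    # Ceiling part: the smallest integer m >= 1 with 2 * m**n >= (m + 2)**n.
    # The condition is monotone in m, and the ceiling is never smaller than
    # floor_part, so the search may safely start just below floor_part.
    m = max(1, floor_part - 2)
    while 2 * m**n < (m + 2)**n:
        m += 1
    return m - floor_part

# Check that the sequence vanishes for the first few hundred terms.
print(all(term(n) == 0 for n in range(1, 301)))
```

A check of this kind says nothing about the first nonzero term, which lies far beyond the reach of a naive loop; that is precisely the caution being urged.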

At the close of this first section, we confess that a survey of mathematics in the 
age of the Turing machine is a reckless undertaking, particularly if it almost completely 
neglects software products and essential mathematical algorithms - the Euclidean
algorithm, Newton's method, Gaussian elimination, fast Fourier transform, simplex
algorithm, sorting, Schönhage-Strassen, and many more. A starting point for the exploration
of mathematical software is KNOPPIX/Math, a bootable DVD with over a hundred free
mathematical software products (Figure 15) [Ham08]. Sage alone has involved over 200
developers and includes dozens of other packages, providing an open-source Python 
scripted alternative to computer algebra systems such as Maple and Mathematica. 

TeX               Active-DVI, AUCTeX, TeXmacs, Kile, WhizzyTeX
computer algebra  Axiom, CoCoA4, GAP, Macaulay2, Maxima,
                  PARI/GP, Risa/Asir, Sage, Singular, Yacas
numerical calc    Octave, Scilab, FreeFem++, Yorick
visualization     3D-XplorMath-J, Dynagraph, GANG, Geomview,
                  gnuplot, JavaView, K3DSurf
geometry          C.a.R, Dr.Geo, GeoGebra, GEONExT, KidsCindy, KSEG
programming       CLISP, Eclipse, FASM, Gauche, GCC, Haskell, Lisp,
                  Prolog, Guile, Lazarus, NASM, Objective Caml,
                  Perl, Python, Ruby, Squeak

Fig. 15. Some free mathematical programs on the Knoppix/Math DVD [Ham08].



2 Computer Proof 

Proof assistants represent the best effort of logicians, computer scientists, and mathematicians
to obtain complete mathematical rigor by computer. This section gives a brief
introduction to proof assistants and describes various recent projects that use them.

The first section described various computer calculations in math, and this section
turns to computer reasoning. I have never been able to get used to it being the mathematicians
who use computers for calculation and the computer scientists who use
computers for proofs!



2.1 design of proof assistants 

A formal proof is a proof that has been checked at the level of the primitive rules of 
inference all the way back to the fundamental axioms of mathematics. The number of 
primitive inferences is generally so large that it is quite hopeless to construct a formal 
proof by hand of anything but theorems of the most trivial nature. McCarthy and de 
Bruijn suggested that we program computers to generate formal proofs from high-level 
descriptions of the proof. This suggestion has led to the development of proof assistants. 

A proof assistant is an interactive computer program that enables a user to generate 
a formal proof from its high-level description. Some examples of theorems that have
been formally verified by proof assistants appear in Figure 16. The computer code that
implements a proof assistant lists the fundamental axioms of mathematics and gives
procedures that implement each of the rules of logical inference. Within this general
framework, there are enormous variations from one proof assistant to the next. The
feature table in Figure 17 is reproduced from [Wie06]. The columns list different proof
assistants, HOL, Mizar, etc.

Since it is the one that I am most familiar with, my discussion will focus largely
on a particular proof assistant, HOL Light, which belongs to the HOL family of proof
assistants. HOL is an acronym for Higher-Order Logic, which is the underlying logic of
these proof assistants. A fascinating account of the history of HOL appears in [Gor00].
In 1972, R. Milner developed a proof-checking program based on a deductive system
LCF (for Logic of Computable Functions) that had been designed by Dana Scott a few
years earlier. A long series of innovations (such as goal-directed proofs and tactics, 
the ML language, enforcing proof integrity through the type system, conversions and 
theorem continuations, rewriting with discrimination nets, and higher-order features) 
have led from LCF to HOL. 



Year  Theorem                    Proof System  Formalizer     Traditional Proof
1986  First Incompleteness       Boyer-Moore   Shankar        Gödel
1990  Quadratic Reciprocity      Boyer-Moore   Russinoff      Eisenstein
1996  Fundamental - of Calculus  HOL Light     Harrison       Henstock
2000  Fundamental - of Algebra   Mizar         Milewski       Brynski
2000  Fundamental - of Algebra   Coq           Geuvers et al. Kneser
2004  Four Color                 Coq           Gonthier       Robertson et al.
2004  Prime Number               Isabelle      Avigad et al.  Selberg-Erdős
2005  Jordan Curve               HOL Light     Hales          Thomassen
2005  Brouwer Fixed Point        HOL Light     Harrison       Kuhn
2006  Flyspeck I                 Isabelle      Bauer-Nipkow   Hales
2007  Cauchy Residue             HOL Light     Harrison       classical
2008  Prime Number               HOL Light     Harrison       analytic proof
2012  Odd Order Theorem          Coq           Gonthier       Feit-Thompson

Fig. 16. Examples of Formal Proofs, adapted from [Hal08].



Without going into full detail, I will make a few comments about what some of the
features mean. Different systems can be commended in different ways: HOL Light for 
its small trustworthy kernel, Coq for its powerful type system, Mizar for its extensive 
libraries, and Isabelle/HOL for its support and usability. 



small proof kernel. If a proof assistant is used to check the correctness of proofs, who 
checks the correctness of the proof assistant itself? De Bruijn proposed that the proofs 
of a proof assistant should be capable of being checked by a short piece of computer 
code - something short enough to be checked by hand. For example, the kernel of 
the proof assistant HOL Light is just 430 lines of very readable computer code. The 
architecture of the system is such that if these 430 lines are bug free then it is incapable*
of generating a theorem that hasn't been properly proved.
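The idea of a small trusted kernel can be illustrated with a toy. The sketch below is written in Python for readability and is not HOL Light's kernel (which is written in OCaml and relies on the ML type system for protection). It assumes a tiny logic with implication only; the names `Thm`, `assume`, `disch`, and `mp` are chosen for illustration. The point is architectural: values of type `Thm` can be produced only by the few inference rules, so auditing those rules is enough.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Imp:
    left: object
    right: object

_KERNEL_KEY = object()  # private token held only by the inference rules below

class Thm:
    """A sequent 'hyps |- concl'. Only the kernel rules can construct one."""
    def __init__(self, hyps, concl, key):
        if key is not _KERNEL_KEY:
            raise ValueError("theorems can only be made by inference rules")
        self.hyps = frozenset(hyps)
        self.concl = concl

def assume(p):
    """p |- p"""
    return Thm({p}, p, _KERNEL_KEY)

def disch(p, th):
    """From hyps |- q, derive hyps - {p} |- p -> q."""
    return Thm(th.hyps - {p}, Imp(p, th.concl), _KERNEL_KEY)

def mp(th_imp, th_p):
    """Modus ponens: from |- p -> q and |- p, derive |- q."""
    c = th_imp.concl
    if not (isinstance(c, Imp) and c.left == th_p.concl):
        raise ValueError("modus ponens does not apply")
    return Thm(th_imp.hyps | th_p.hyps, c.right, _KERNEL_KEY)

A, B = Var("A"), Var("B")
identity = disch(A, assume(A))      # |- A -> A, with no hypotheses
```

Python cannot truly hide `_KERNEL_KEY` from a determined user, which is one reason real kernels lean on an abstract data type enforced by the compiler.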



automating calculations. Mathematical argument involves both calculation and proof. 
The foundations of logic often specify in detail what constitutes a mathematical proof (a 



* I exaggerate. Section 3 goes into detail about trust in computers.

Columns (proof assistants), left to right: (1) HOL, (2) Mizar, (3) PVS, (4) Coq,
(5) Otter/Ivy, (6) Isabelle/Isar, (7) Alfa/Agda, (8) ACL2, (9) PhoX, (10) IMPS,
(11) Metamath, (12) Theorema, (13) Lego, (14) Nuprl, (15) Ωmega, (16) B method,
(17) Minlog.

                                            1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17
small proof kernel ('proof objects')        +  -  -  +  +  +  +  -  +  -  +  -  +  -  +  -  +
calculations can be proved automatically    +  -  +  +  +  +  -  +  +  +  -  +  +  +  +  +  +
extensible/programmable by the user         +  -  +  +  -  +  -  -  -  -  -  -  -  +  +  -  +
powerful automation                         +  -  +  -  +  +  -  +  -  +  -  +  -  -  +  +  -
readable proof input files                  -  +  -  -  -  +  -  +  -  -  -  +  -  -  -  -  -
constructive logic supported                -  -  -  +  -  +  +  -  -  -  +  -  +  +  -  -  +
logical framework                           -  -  -  -  -  +  -  -  -  -  +  -  -  -  -  -  -
typed                                       +  +  +  +  -  +  +  -  +  +  -  -  +  +  +  -  +
decidable types                             +  +  -  +  -  +  +  -  +  +  -  -  +  -  +  -  +
dependent types                             -  +  +  +  -  -  +  -  -  -  -  -  +  +  -  -  -
based on higher order logic                 +  -  +  +  -  +  +  -  +  +  -  +  +  +  +  -  -
based on ZFC set theory                     -  +  -  -  -  +  -  -  -  -  +  -  -  -  -  +  -
large mathematical standard library         +  +  +  +  -  +  -  -  -  +  -  -  -  +  -  -  -

Fig. 17. Features of proof assistants [Wie06]. The table is published by permission from Springer
Science+Business Media B.V.



sequence of logical inferences from the axioms), but downgrade calculation to second- 
class status, requiring every single calculation to undergo a cumbersome translation 
into logic. Some proof assistants allow reflection (sometimes implausibly attributed to 
Poincare), which admits as proof the output from a verified algorithm (bypassing the 
expansive translation into logic of each separate execution of the algorithm) [Poi52,
p. 4], [Bar07].

constructive logic. The law of excluded middle $\varphi \lor \neg\varphi$ is accepted in classical logic,
but rejected in constructive logic. A proof assistant may be constructive or classical.
A box (A Mathematical Gem) shows how HOL Light becomes classical through the
introduction of an axiom of choice.

A Mathematical Gem - Proving the Excluded Middle

The logic of HOL Light is intuitionistic until the axiom of choice is introduced and
classical afterwards. By a result of Diaconescu [Bee85], choice and extensionality
imply the law of excluded middle:

$$\varphi \lor \neg\varphi.$$

The proof is such a gem that I have chosen to include it as the only complete proof
in this survey article. Consider the two sets of booleans

$$P_1 = \{x \mid (x = \mathrm{false}) \lor ((x = \mathrm{true}) \land \varphi)\} \quad \text{and}$$
$$P_2 = \{x \mid (x = \mathrm{true}) \lor ((x = \mathrm{false}) \land \varphi)\}.$$

The sets are evidently nonempty, because $\mathrm{false} \in P_1$ and $\mathrm{true} \in P_2$. By choice, we
may pick $x_1 \in P_1$ and $x_2 \in P_2$; and by the definition of $P_1$ and $P_2$:

$$(x_1 = \mathrm{false}) \lor (x_1 = \mathrm{true}), \qquad (x_2 = \mathrm{false}) \lor (x_2 = \mathrm{true}).$$

We may break the proof of the excluded middle into four cases, depending on the
two possible truth values of each of $x_1$ and $x_2$.

Cases $(x_1, x_2) = (\mathrm{true}, \mathrm{true})$, $(x_1, x_2) = (\mathrm{true}, \mathrm{false})$: By the definition of $P_1$, if $x_1 = \mathrm{true}$,
then $\varphi$, so $\varphi \lor \neg\varphi$.

Case $(x_1, x_2) = (\mathrm{false}, \mathrm{false})$: Similarly, by the definition of $P_2$, if $x_2 = \mathrm{false}$, then $\varphi$,
so also $\varphi \lor \neg\varphi$.

Case $(x_1, x_2) = (\mathrm{false}, \mathrm{true})$: If $\varphi$, then $P_1 = P_2$, and the choices $x_1$ and $x_2$ reduce to
a single choice $x_1 = x_2$, which contradicts $(x_1, x_2) = (\mathrm{false}, \mathrm{true})$. Hence $\varphi$ implies
false; which by the definition of negation gives $\neg\varphi$, so also $\varphi \lor \neg\varphi$. Q.E.D.
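The same phenomenon can be observed in another proof assistant. As a hedged aside (Lean 4 is not discussed in this survey), Lean's core library states excluded middle as the theorem `Classical.em` and derives it from choice and extensionality rather than assuming it. The snippet only uses that theorem and asks the system which axioms it rests on; no output is shown here because the exact listing depends on the Lean version.

```lean
-- Excluded middle is a theorem, not an axiom, in Lean 4's core library.
example (p : Prop) : p ∨ ¬p := Classical.em p

-- Ask Lean which axioms the library proof depends on.
#print axioms Classical.em
```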

logical framework. Many different systems of logic arise in computer science. In some
proof assistants the logic is fixed. Other proof assistants are more flexible, allowing
different logics to be plugged in and played with. The more flexible systems implement
a meta-language, a logical framework, that gives support for the implementation of
multiple logics. Within a logical framework, the logic and axioms of a proof assistant
can themselves be formalized, and machine translations* can be constructed between
different foundations of mathematics [IR11].



type theory. Approaching the subject of formal proofs as a mathematician whose practice
was shaped by Zermelo-Fraenkel set theory, I first treated types as nothing more
than convenient identifying labels (such as real number, natural number, list of integers,
or boolean) attached to terms, like the PLU stickers on fruit that get peeled away before
consumption. Types are familiar from programming languages as a way of identifying
what data structure is what. In the simple type system of HOL Light, to each term is
affixed a unique type, which is either a primitive type (such as the boolean type bool),
a type variable ($A, B, C, \ldots$), or inductively constructed from other types with the arrow
constructor ($A \to B$, $A \to (\mathrm{bool} \to C)$, etc.). There is also a way to create subtypes of
existing types. If the types are interpreted naively as sets, then $x : A$ asserts that the term
$x$ is a member of $A$, and $f : A \to B$ asserts that $f$ is a member of $A \to B$, the set of
functions from $A$ to $B$.

In untyped set theory, it is possible to ask ridiculous questions such as whether the
real number $\pi = 3.14\ldots$, when viewed as a raw set, is a finite group. In fact, in a
random exploration of set theory, like a monkey composing sonnets at the keyboard,
ridiculous questions completely overwhelm all serious content. Types organize data on
the computer in meaningful ways to cut down on the static noise in the system. The
question about $\pi$ and groups is not well-typed and cannot be asked. Russell's paradox
also disappears: $X \in X$ is not well-typed. For historical reasons, this is not surprising:
Russell and Whitehead first introduced types to overcome the paradoxes of set theory,
and from there, through Church, they passed into computer science.

Only gradually have I come to appreciate the significance of a comprehensive theory 
of types. The type system used by a proof assistant determines to a large degree how 
much of a proof the user must contribute and how much the computer automates behind 
the scenes. The type system is decidable if there is a decision procedure to determine 
the type of each term. 

A type system is dependent if a type can depend on another term. For example,
Euclidean space $\mathbb{R}^n$ depends on its dimension $n$. For this reason, Euclidean space is
most naturally implemented in a proof assistant as a dependent type. In a proof assistant
such as HOL Light that does not have dependent types, extra work is required to develop
a Euclidean space library.
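A small illustration of a type that depends on a term, written in Lean 4 as a sketch only. The names `Vec` and `vecAdd` are hypothetical, and integer coordinates stand in for real numbers so that the example needs no library beyond Lean's core.

```lean
-- The type of n-dimensional integer vectors depends on the term n.
abbrev Vec (n : Nat) : Type := Fin n → Int

-- Addition is only defined between vectors of the same dimension.
def vecAdd {n : Nat} (u v : Vec n) : Vec n :=
  fun i => u i + v i

-- Because the dimension is part of the type, adding a `Vec 2` to a `Vec 3`
-- is rejected by the type checker rather than discovered at run time.
```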



* My long-term Flyspeck project seeks to give a formal proof of the Kepler conjecture [Hal05a].
This project is now scattered between different proof assistants. Logical-framework-based
translations between proof assistants give me hope that an automated tool may assemble the
scattered parts of the project.

2.2 propositions as types

I mentioned the naive interpretation of each type A as a set and a term x:A as a member 
of the set. A quite different interpretation of types has had considerable influence in 
the design of proof assistants. In this "terms-as-proofs" view, a type A represents a 
proposition and a term $x : A$ represents a proof of the proposition $A$. A term with an
arrow type, $f : A \to B$, can be used to construct a proof $f(x)$ of $B$ from a proof $x$ of
$A$. In this interpretation, the arrow is logical implication.

A further interpretation of types comes from programming languages. In this
"terms-as-computer-programs" view, a term is a program and the type is its specification. For
example, $f : A \to B$ is a program $f$ that takes input of type $A$ and returns a value of
type $B$.

By combining the "terms as proofs" with the "terms as computer programs" in- 
terpretations, we get the famous Curry-Howard correspondence that identifies proofs 
with computer programs and identifies each proposition with the type of a computer 
program. For example, the most fundamental rule of logic,

$$\frac{A \qquad A \to B}{B} \quad (\text{modus ponens})$$

(from $A$ and $A$-implies-$B$ follows $B$) is identified with the function application in a computer
program; from $x : A$ and $f : A \to B$ we get $f(x) : B$. To follow the correspondence
is to extract an executable computer program from a mathematical proof. The Curry-
Howard correspondence has been extremely fruitful, with a multitude of variations, 
running through a gamut of proof systems in logic and identifying each with a suitable 
programming domain. 
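The correspondence can be seen in miniature in a system that treats propositions as types. The following Lean 4 lines are a sketch for illustration; the names `modusPonens` and `applyFn` are hypothetical. The proof of modus ponens and the program that applies a function are literally the same term, `f x`.

```lean
-- Read as logic: from a proof x of A and a proof f of A → B, obtain a proof of B.
theorem modusPonens {A B : Prop} (x : A) (f : A → B) : B := f x

-- Read as programming: from a value x of type α and a function f, compute f x.
def applyFn {α β : Type} (x : α) (f : α → β) : β := f x
```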



2.3 proof tactics 

In some proof assistants, the predominant proof style is a backward style proof. The 
user starts with a goal, which is a statement to be proved. In interactive steps, the user 
reduces the goal to successively simpler goals until there is nothing left to prove. 

Each command that reduces a goal to simpler goals is called a tactic. For example, in
the proof assistant HOL Light, there are about 100 different commands that are tactics or
higher-order operators on tactics (called tacticals). Figure 18 shows the most commonly
used proof commands in HOL Light. The most common tactic is rewriting, which takes
a theorem of the form $a = b$ and substitutes $b$ for an occurrence of $a$ in the goal.
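A backward proof can be modeled as functions from a goal to a list of remaining subgoals. The Python sketch below is a toy under stated assumptions and is not HOL Light code: goals are pairs (assumptions, conclusion), formulas are strings or tagged tuples, and the names `disch_tac`, `conj_tac`, `assume_tac`, `then`, and `repeat` merely echo the flavor of the real commands. Real tactics also return a justification that rebuilds the theorem through the kernel, which is omitted here.

```python
class TacticFailure(Exception):
    pass

def disch_tac(goal):
    """Move the hypothesis of an implication into the assumptions."""
    hyps, concl = goal
    if isinstance(concl, tuple) and concl[0] == "imp":
        return [(hyps | {concl[1]}, concl[2])]
    raise TacticFailure("goal is not an implication")

def conj_tac(goal):
    """Split a conjunction into two subgoals."""
    hyps, concl = goal
    if isinstance(concl, tuple) and concl[0] == "and":
        return [(hyps, concl[1]), (hyps, concl[2])]
    raise TacticFailure("goal is not a conjunction")

def assume_tac(goal):
    """Close a goal that already appears among the assumptions."""
    hyps, concl = goal
    if concl in hyps:
        return []
    raise TacticFailure("goal is not among the assumptions")

def then(t1, t2):
    """Tactical: apply t1, then apply t2 to every resulting subgoal."""
    def tac(goal):
        return [g2 for g1 in t1(goal) for g2 in t2(g1)]
    return tac

def repeat(t):
    """Tactical: apply t as many times as possible."""
    def tac(goal):
        try:
            subgoals = t(goal)
        except TacticFailure:
            return [goal]
        return [g2 for g1 in subgoals for g2 in tac(g1)]
    return tac

# Goal: A ==> (B ==> (A and B)), with no assumptions.
goal = (frozenset(), ("imp", "A", ("imp", "B", ("and", "A", "B"))))
prove = then(then(repeat(disch_tac), conj_tac), assume_tac)
remaining = prove(goal)   # an empty list means nothing is left to prove
```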



In the Coq proof assistant, the tactic system has been streamlined to an extraordinary
degree by the SSReflect package, becoming a model of efficiency for other proof
assistants to emulate, with an extremely small number of tactics such as the move tactic
for bookkeeping, one for rewriting, ones for forward and backward reasoning, and
another for case analysis [GM11], [GMT11]. The package also provides support for
exploiting the computational content of proofs, by integrating logical reasoning with
efficient computational algorithms.

name          purpose                                                usage
THEN          combine two tactics into one                           37.2%
REWRITE       use a = b to replace a with b in goal                  14.5%
MP_TAC        introduce a previously proved theorem                   4.0%
SIMP_TAC      rewriting with conditionals                             3.1%
MATCH_MP_TAC  reduce a goal b to a, given a theorem a => b            3.0%
STRIP_TAC     (bookkeeping) unpackage a bundled goal                  2.9%
MESON_TAC     apply first-order reasoning to solve the goal           2.6%
REPEAT        repeat a tactic as many times as possible               2.5%
DISCH_TAC     (bookkeeping) move hypothesis to the assumption list    2.3%
EXISTS_TAC    instantiate an existential goal ∃x ...                  2.3%
GEN_TAC       instantiate a universal goal ∀x ...                     1.4%

Fig. 18. A few of the most common proof commands in the HOL Light proof assistant



2.4 first-order automated reasoning 

Many proof assistants support some form of automated reasoning to relieve the user
of doing rote logic by hand. For example, Figure 18 lists meson (an acronym for Loveland's
Model Elimination procedure), which is HOL Light's tactic for automated reasoning
[Har09, Sec. 3.15], [Har96]. The various automated reasoning tools are generally
first-order theorem provers. The classic resolution algorithm for first-order reasoning
is illustrated in a box (Proof by Resolution).

Proof by Resolution

Resolution is the granddaddy of automated reasoning in first-order logic. The resolution
rule takes two disjunctions

$$P \lor A \quad \text{and} \quad \neg P' \lor B$$

and concludes

$$A' \lor B',$$

where $A'$ and $B'$ are the specializations of $A$ and $B$, respectively, under the most
general unifier of $P$ and $P'$. (Examples of this in practice appear below.)

This box presents a rather trivial example of proof by resolution, to deduce the easy
theorem asserting that every infinite set has a member. The example will use the
following notation. Let $\emptyset$ be a constant representing the empty set and constant $c$
representing a given infinite set. We use three unary predicates $e$, $f$, $i$ that have
interpretations

$e(X)$ "$X$ is empty", $f(X)$ "$X$ is finite", $i(X)$ "$X$ is infinite."

The binary predicate $(\in)$ denotes set membership. We prove $i(c) \Rightarrow (\exists z.\, z \in c)$ "an
infinite set has a member" by resolution.

To argue by contradiction, we introduce the hypothesis $i(c)$ and the negated conclusion
$\neg(Z \in c)$ as axioms. Here are the axioms that we allow in the deduction. The
axioms have been preprocessed, stripped of quantifiers, and written as a disjunction
of literals. Upper case letters are variables.

Axiom                                                     Informal Description
1. $i(c)$                                                 Assumption of desired theorem.
2. $\neg(Z \in c)$                                        Negation of conclusion of desired theorem.
3. $e(X) \lor (m(X) \in X)$                               A nonempty set has a member.
4. $e(\emptyset)$                                         The empty set is empty.
5. $f(\emptyset)$                                         The empty set is finite.
6. $\neg i(Y) \lor \neg f(Y)$                             A set is not both finite and infinite.
7. $\neg e(U) \lor \neg e(V) \lor \neg i(U) \lor i(V)$    Weak indistinguishability of empty sets.

Here are the resolution inferences from this list of axioms. The final step obtains the
desired contradiction.

Inference                                                           Resolvent
8. (resolving 2,3, unifying $X$ with $c$ and $m(X)$ with $Z$)       $e(c)$
9. (resolving 7,8, unifying $U$ with $c$)                           $\neg e(V) \lor \neg i(c) \lor i(V)$
10. (resolving 1,9)                                                 $\neg e(V) \lor i(V)$
11. (resolving 4,10, unifying $V$ with $\emptyset$)                 $i(\emptyset)$
12. (resolving 6,11, unifying $Y$ with $\emptyset$)                 $\neg f(\emptyset)$
13. (resolving 12,5)                                                $\bot$

Q.E.D.
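The six inferences of the box can be replayed mechanically. The following Python sketch is a minimal illustration, not any prover's actual code. Its assumptions: variables are strings starting with an upper case letter, constants and symbols are lower case strings, compound terms are tuples, the two clauses being resolved share no variables, and the unifier omits the occurs check. The names `unify`, `resolvents`, and `check_proof` are hypothetical, as is the constant name `empty` for the empty set.

```python
def is_var(t):
    return isinstance(t, str) and t[:1].isupper()

def walk(t, s):
    while is_var(t) and t in s:
        t = s[t]
    return t

def unify(a, b, s):
    """Extend substitution s to a most general unifier of a and b, or return None."""
    a, b = walk(a, s), walk(b, s)
    if a == b:
        return s
    if is_var(a):
        return {**s, a: b}
    if is_var(b):
        return {**s, b: a}
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        for x, y in zip(a, b):
            s = unify(x, y, s)
            if s is None:
                return None
        return s
    return None

def subst(t, s):
    t = walk(t, s)
    return tuple(subst(x, s) for x in t) if isinstance(t, tuple) else t

def resolvents(c1, c2):
    """All clauses obtained by resolving one literal of c1 against one of c2."""
    out = []
    for sign1, atom1 in c1:
        for sign2, atom2 in c2:
            if sign1 != sign2:
                s = unify(atom1, atom2, {})
                if s is not None:
                    rest = (c1 - {(sign1, atom1)}) | (c2 - {(sign2, atom2)})
                    out.append(frozenset((sg, subst(at, s)) for sg, at in rest))
    return out

def pos(*atom): return (True, atom)
def neg(*atom): return (False, atom)
def clause(*lits): return frozenset(lits)

axioms = {
    1: clause(pos("i", "c")),
    2: clause(neg("in", "Z", "c")),
    3: clause(pos("e", "X"), pos("in", ("m", "X"), "X")),
    4: clause(pos("e", "empty")),
    5: clause(pos("f", "empty")),
    6: clause(neg("i", "Y"), neg("f", "Y")),
    7: clause(neg("e", "U"), neg("e", "V"), neg("i", "U"), pos("i", "V")),
}

# (label, left parent, right parent, claimed resolvent), as in steps 8-13.
steps = [
    (8, 2, 3, clause(pos("e", "c"))),
    (9, 7, 8, clause(neg("e", "V"), neg("i", "c"), pos("i", "V"))),
    (10, 1, 9, clause(neg("e", "V"), pos("i", "V"))),
    (11, 4, 10, clause(pos("i", "empty"))),
    (12, 6, 11, clause(neg("f", "empty"))),
    (13, 12, 5, clause()),          # the empty clause is the contradiction
]

def check_proof(axioms, steps):
    known = dict(axioms)
    for label, left, right, claimed in steps:
        if claimed not in resolvents(known[left], known[right]):
            return False
        known[label] = claimed
    return True

proof_ok = check_proof(axioms, steps)
```

Checking a given derivation is far easier than finding one; a real prover must also search over which clauses to resolve.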

Writing about first-order automated reasoning, Huet and Paulin-Mohring [BC04]
describe the situation in the early 1970s as a "catastrophic state of the art." "The stan- 
dard mode of use was to enter a conjecture and wait for the computer's memory to ex- 
haust its capacity. Answers were obtained only in exceptionally trivial cases." They go 
on to describe numerous developments (Knuth-Bendix, LISP, rewriting technologies, 
LCF, ML, Martin-Lof type theory, NuPrl, Curry-Howard correspondence, dependent 
types, etc.) that led up to the Coq proof assistant. These developments led away from 
first-order theorem proving with its "thousands of unreadable logical consequences" to 
a highly structured approach to theorem proving in Coq. 

First-order theorem proving has developed significantly over the years into sophisticated
software products. They are no longer limited to "exceptionally trivial cases."
Many different software products compete in an annual competition (CASC), to see
which can solve difficult first-order problems the fastest. The LTB (large theory batch)
division of the competition includes problems with thousands of axioms [PSST08].
Significantly, this is the same order of magnitude as the total number of theorems in a proof
assistant. What this means is that first-order theorem provers have reached the stage
of development that they might be able to give fully automated proofs of new theorems
in a proof assistant, working from the full library of previously proved theorems.

sledgehammer. The Sledgehammer tactic is Paulson's implementation of this idea of
full automation in the Isabelle/HOL proof assistant [Pau10]. As the name 'Sledgehammer'
suggests, the tactic is all-purpose and powerful, but demolishes all higher mathematical
structure, treating every goal as a massive unstructured problem in first-order
logic. If $L$ is the set of all theorems in the Isabelle/HOL library, and $g$ is a goal, it would
be possible to hand off the problem $L \Rightarrow g$ to a first-order theorem prover. However,
success rates are dramatically improved when the theorems in $L$ are first assessed by
heuristic rules for their likely relevance for the goal $g$, in a process called relevance
filtering. This filtering is used to reduce $L$ to an axiom set $L'$ of a few hundred theorems
that are deemed most likely to prove $g$.
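To make the idea of relevance filtering concrete, here is a deliberately crude Python sketch. It is not Sledgehammer's actual filter, whose heuristics are more refined. The assumptions are mine: statements are whitespace-separated token strings, relevance is the Jaccard overlap between the symbols of a library theorem and those of the goal, and the names `symbols` and `relevance_filter` and the three-theorem library are all hypothetical.

```python
def symbols(statement):
    """Set of symbols in a statement given as whitespace-separated tokens."""
    return set(statement.split())

def relevance_filter(library, goal, keep):
    """Return the names of the `keep` library theorems sharing most symbols with the goal."""
    goal_syms = symbols(goal)

    def score(item):
        _, statement = item
        syms = symbols(statement)
        union = syms | goal_syms
        return len(syms & goal_syms) / len(union) if union else 0.0

    ranked = sorted(library.items(), key=score, reverse=True)
    return [name for name, _ in ranked[:keep]]

library = {
    "add_comm": "plus a b eq plus b a",
    "mul_comm": "times a b eq times b a",
    "subset_refl": "subset s s",
}
selected = relevance_filter(library, "plus x y eq plus y x", keep=2)
```

Only the selected theorems would then be handed to the first-order provers, in place of the whole library.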

The problem $L' \Rightarrow g$ is stripped of type information, converted to a first-order problem, and
fed to first-order theorem provers. Experiments indicate that it is more effective to feed
a problem in parallel into multiple first-order provers for a five-second burst than to
hand the problem to the best prover (Vampire) for a prolonged attack [Pau10], [BN10].
When luck runs in your favor, one of the first-order theorem provers finds a proof.

The reconstruction of a formal proof from a first-order proof can encounter hurdles. 
For one thing, when type information is stripped from the problem (which is done to 
improve performance), soundness is lost. "In unpublished work by Urban, MaLARea 
[a machine learning program for relevance ranking] easily proved the full Sledgeham- 
mer test suite by identifying an inconsistency in the translated lemma library; once 
MaLARea had found the inconsistency in one proof, it easily found it in all the others"
[Pau10], [Urb07]. Good results have been obtained in calling the first-order prover
repeatedly to find a smaller set of axioms $L'' \subset L'$ that imply the goal $g$. A manageably
sized set $L''$ is then passed to the metis tactic* in Isabelle/HOL, which constructs
a formal proof $L'' \Rightarrow g$ from scratch.



* Metis is a program that automates first-order reasoning [Met].

Böhme and Nipkow took 1240 proof goals that appear in several diverse theories
of the Isabelle/HOL system and ran sledgehammer on all of them [BN10]. The results
are astounding. The success rate (of obtaining fully reconstructed formal proofs) when
three different first-order provers run for two minutes each was 48%. The proofs of
these same goals by hand might represent years of human labor, now fully automated
through a single new tool.

Sledgehammer has led to a new style of theorem proving, in which the user is primarily
responsible for stating the goals. In the final proof script, there is no explicit mention
of sledgehammer. Metis proves the goals, with sledgehammer operating silently in
the background to feed metis with whatever theorems it needs. For example, a typical
proof script might contain lines such as [Pau10]

hence "x ⊆ space M"
by (metis sets_into_space lambda_system_sets)

The first line is the goal that the user types. The second line has been automatically
inserted into the proof script by the system, with the relevant theorems sets_into_space,
etc. selected by Sledgehammer.

2.5 computation in proof assistants. 

One annoyance of formal proof systems is the difficulty in locating the relevant theorems.
At last count, HOL Light had about 14,000 theorems and nearly a thousand procedures
for proof construction. Larger developments, such as Mizar, have about twice
as many theorems. Good search tools have somewhat relieved the burden of locating 
theorems in the libraries. However, as the formal proof systems continue to grow, it 
becomes ever more important to find ways to use theorems without mentioning them 
by name. 

As an example of a feature which commendably reduces the burden of memoriz- 
ing long lists of theorem names, I mention the REAL_RING command in HOL Light, 
which is capable of proving any system of equalities and inequalities that holds over an 
arbitrary integral domain. For example, I can give a one-line formal proof of an isogeny
$(x_1, y_1) \mapsto (x_2, y_2)$ of elliptic curves: if we have a point on the first elliptic curve:

$$y_1^2 = 1 + a x_1^2 + b x_1^4,$$
$$x_2 y_1 = x_1,$$
$$y_2 y_1^2 = 1 - b x_1^4,$$

then $(x_2, y_2)$ lies on a second elliptic curve

$$y_2^2 = 1 + a' x_2^2 + b' x_2^4,$$

where $a' = -2a$ and $b' = a^2 - 4b$. In the proof assistant, the input of the statement is
as economical as what I have written here. We expect computer algebra systems to be 
capable of checking identities like this, but to my amazement, I found it easier to check 
this isogeny in HOL Light than to check it in Mathematica. 
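For readers who want to see why the identity holds, here is a hand calculation written out as a sketch. The abbreviations $x = x_1$ and $Q = y_1^2$ are introduced only for this calculation and are not part of the original statement. Multiplying the target equation through by $y_1^4$ and using the three hypotheses gives:

```latex
\begin{align*}
Q &:= y_1^2 = 1 + a x^2 + b x^4, \qquad x := x_1, \\
y_1^4 \left(1 + a' x_2^2 + b' x_2^4\right) &= Q^2 + a' x^2 Q + b' x^4
    && \text{since } x_2 y_1 = x, \\
Q^2 &= 1 + 2a x^2 + (a^2 + 2b) x^4 + 2ab\, x^6 + b^2 x^8, \\
a' x^2 Q &= -2a x^2 - 2a^2 x^4 - 2ab\, x^6
    && \text{since } a' = -2a, \\
Q^2 + a' x^2 Q + b' x^4 &= 1 + (2b - a^2 + b')\, x^4 + b^2 x^8 \\
  &= 1 - 2b x^4 + b^2 x^8 = \left(1 - b x^4\right)^2
    && \text{since } b' = a^2 - 4b, \\
  &= y_1^4\, y_2^2
    && \text{since } y_2 y_1^2 = 1 - b x^4 .
\end{align*}
```

Cancelling $y_1^4$ is legitimate in an integral domain because $y_1 \neq 0$: if $y_1$ were zero, then $x_1 = x_2 y_1 = 0$ and the first equation would force $y_1^2 = 1$.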

The algorithm works in the following manner. A universally quantified system of
equalities and inequalities holds over all integral domains if and only if it holds over all
fields. By putting the formula in conjunctive normal form, it is enough to prove a finite
number of polynomial identities of the form:

$$(p_1 = 0) \lor \cdots \lor (p_n = 0) \lor (q_1 \neq 0) \lor \cdots \lor (q_k \neq 0). \qquad (2)$$

An element in a field is zero, if and only if it is not a unit. Thus we may rewrite each
polynomial equality $p_i = 0$ as an equivalent inequality $1 - p_i z \neq 0$, where $z$ is a fresh
variable. Thus, without
loss of generality, we may assume that $n = 0$; so that all disjuncts are inequalities. The
formula (2) is logically equivalent to

$$(q_1 = 0) \land \cdots \land (q_k = 0) \Rightarrow \text{false}.$$

In other words, it is enough to prove that the zero set of the ideal $I = (q_1, \ldots, q_k)$ is
empty. For this, we may use Gröbner bases* to prove that $1 \in I$, to certify that the zero
set is empty.

Grobner basis algorithms give an example of a certificate-producing procedure. A 
formal proof is obtained in two stages. In the first stage an unverified algorithm pro- 
duces a certificate. In the second stage the proof assistant analyzes the certificate to 
confirm the results. Certificate-producing procedures open the door to external tools, 
which tremendously augment the power of the proof assistant. The meson procedure is
implemented this way, as a search followed by verification. Other certificate-producing
procedures in use in proof assistants are linear programming, SAT, and SMT. 
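The second stage of a certificate-producing procedure can be very simple. As an illustrative Python sketch (not the code of any proof assistant), suppose an untrusted tool claims that 1 lies in the ideal generated by some polynomials and supplies cofactors as its certificate. Checking the claim needs only polynomial multiplication and addition. The assumptions here are mine: polynomials in two variables are dictionaries from exponent pairs to integer coefficients, and the names `poly_mul`, `poly_add`, and `check_certificate` are hypothetical.

```python
from collections import defaultdict

def poly_mul(p, q):
    """Product of two polynomials stored as {exponent tuple: coefficient}."""
    out = defaultdict(int)
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            out[tuple(a + b for a, b in zip(m1, m2))] += c1 * c2
    return {m: c for m, c in out.items() if c != 0}

def poly_add(p, q):
    """Sum of two polynomials, dropping monomials whose coefficient cancels."""
    out = defaultdict(int)
    for m, c in list(p.items()) + list(q.items()):
        out[m] += c
    return {m: c for m, c in out.items() if c != 0}

def check_certificate(generators, cofactors):
    """Accept only if sum(cofactor_i * generator_i) is the constant polynomial 1."""
    if len(generators) != len(cofactors):
        return False
    total = {}
    for g, c in zip(generators, cofactors):
        total = poly_add(total, poly_mul(g, c))
    return total == {(0, 0): 1}

# Variables (x, z). The ideal (x, 1 - x*z) has empty zero set:
# z * x + 1 * (1 - x*z) = 1.
x = {(1, 0): 1}
one_minus_xz = {(0, 0): 1, (1, 1): -1}
z = {(0, 1): 1}
one = {(0, 0): 1}

accepted = check_certificate([x, one_minus_xz], [z, one])
rejected = check_certificate([x, one_minus_xz], [one, one])
```

However the cofactors were found, and however buggy the tool that found them, a wrong certificate is simply rejected by the check.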

Another praiseworthy project is Kaliszyk and Wiedijk's implementation of a computer
algebra system on top of the proof assistant HOL Light. It combines the ease of
use of computer algebra with the rigor of formal proof [KW07]. Even with its notational
idiosyncrasies (& and # as markers of real numbers, Cx as a marker of complex
numbers, ii for $\sqrt{-1}$, and -- for unary negation), it is the kind of product that I can
imagine finding widespread adoption by mathematicians. Some of the features of the
system are shown in Figure 19.



2.6 formalization of finite group tlieory 

The Feit-Thompson theorem, or odd-order theorem, is one of the most significant theo- 
rems of the twentieth century. (For his work, Thompson was awarded the three highest 
honors in the mathematical world: the Fields Medal, the Abel Prize, and the Wolf Prize.) 
The Feit-Thompson theorem states that every finite simple group has even order, except 
for cyclic groups of prime order The proof, which runs about 250 pages, is extremely 
technical. The Feit-Thompson theorem launched the endeavor to classify all finite sim- 
ple groups, a monumental undertaking that consumed an entire generation of group 
theorists. 

Gonthier's team has formalized the proof of the Feit-Thompson theorem IIGonl2L 
To me as a mathematician, nothing else that has been done by the formal proof com- 
munity compares in splendor to the formalization of this theorem. Finally, we are doing 

* Kaliszyk's benchmarks suggest that the Grobner basis algorithm in the proof assistant Isabelle 
runs about twenty times faster than that of HOL Light. 
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Fig. 19. Interaction with a formally verified computer algebra system IKW07I . 



real mathematics! The project formahzed two books, MBG94I and OPetOOI . as well as a 
significant body of background material. 

The structures of abstract algebra - groups, rings, modules, algebras, algebraically 
closed fields and so forth - have all been laid out formally in the Coq proof assistant. 
Analogous algebraic hierarchies appear in systems such as OpenAxiom, MathScheme, 
Mizar, and Isabelle; and while some of these hierarchies are elaborate, none have delved 
so deeply as the development for Feit-Thompson. It gets multiple abstract structures to 
work coherently together in a formal setting. "The problem is not so much in capturing 
the semantics of each individual construct but rather in having all the concepts working 
together well" [GMR07].

    Structure finGroupType : Type := FinGroupType {
      element :> finType;
      1 : element;
      _⁻¹ : element → element;
      _ * _ : element → element → element;
      unitP : ∀ x, 1 * x = x;
      invP : ∀ x, x⁻¹ * x = 1;
      mulP : ∀ x1 x2 x3, x1 * (x2 * x3) = (x1 * x2) * x3
    }.

Fig. 20. The structure of a finite group [GMR07].



The definition of a finite group in Coq is similar to the textbook definition, expressed
in types and structures (Figure 20). It declares a finite type called element that is the
group carrier or domain. The rest of the structure specifies a left-unit element 1, a left- 
inverse and an associative binary operation (*). 
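
To make the shape of this structure concrete, here is a loose Python analogue. It is only a
sketch under stated assumptions: the class name `FinGroup` and the method `axioms_hold` are
invented for this illustration, and where Coq carries proofs of the three laws as fields of the
structure, this version can only test them by brute force over the finite carrier.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, FrozenSet, Hashable


@dataclass(frozen=True)
class FinGroup:
    elements: FrozenSet[Hashable]                   # the finite carrier
    one: Hashable                                   # left unit
    inv: Callable[[Hashable], Hashable]             # left inverse
    mul: Callable[[Hashable, Hashable], Hashable]   # binary operation

    def axioms_hold(self) -> bool:
        """Check closure plus the laws unitP, invP, and mulP exhaustively."""
        els = self.elements
        closed = (
            self.one in els
            and all(self.inv(x) in els for x in els)
            and all(self.mul(x, y) in els for x, y in product(els, repeat=2))
        )
        unit_p = all(self.mul(self.one, x) == x for x in els)
        inv_p = all(self.mul(self.inv(x), x) == self.one for x in els)
        mul_p = all(
            self.mul(x, self.mul(y, z)) == self.mul(self.mul(x, y), z)
            for x, y, z in product(els, repeat=3)
        )
        return closed and unit_p and inv_p and mul_p


# The integers modulo 5 under addition form a group.
z5 = FinGroup(frozenset(range(5)), 0, lambda x: (-x) % 5, lambda x, y: (x + y) % 5)

# Subtraction modulo 5 does not: 0 is not a left unit for it.
not_a_group = FinGroup(
    frozenset(range(5)), 0, lambda x: (-x) % 5, lambda x, y: (x - y) % 5
)

print(z5.axioms_hold())
print(not_a_group.axioms_hold())
```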
Other aspects of Gonthier's recent work can be found at [Gon11], [GGMR09],
[BGBP08]. Along different lines, a particularly elegant organization of abstract algebra
and category theory is obtained with type classes [SvdW11].

2.7 homotopy type theory 

The simple type theory of HOL Light is adequate for real analysis, where relatively few 
types are needed - one can go quite far with natural numbers, real numbers, booleans, 
functions between these types, and a few functionals. However, the dependent type 
theory of Coq is better equipped than HOL Light for the hierarchy of structures from 
groups to rings of abstract algebra. But even Coq's type theory is showing signs of strain 
in dealing with abstract algebra. For instance, an unpleasant limitation of Coq's theory 
of types is that it lacks the theorem of extensionality for functions: if two functions take 
the same value for every argument, it does not follow that the two functions are equal.*
The gymnastics to solve the problem of function extensionality in the context of the
Feit-Thompson theorem are found in [GMR07].

A lack of function extensionality is an indication that equality in type theory may
be misconceived. Recently, homotopy type theory, which turns to homotopy theory and
higher categories as models of type theory, has exploded onto the scene [HTT11]. It is
quite natural to interpret a dependent type (viewed as a family of types parametrized by
a second type) topologically as a fibration (viewed as a family of fibers parametrized
by a base space) [AW09]. Voevodsky took the homotopical notions of equality and
equivalence and translated them back into type theory, obtaining the univalence axiom
of type theory, which posits what types are equivalent [Voe11], [PW12], [KLV12b],
[KLV12a]. One consequence of the univalence axiom is the theorem of extensionality
for functions. Another promising sign for computer theorem-proving applications is
that the univalence axiom appears to preserve the computable aspects of type theory
(unlike, for instance, the axiom of choice, which makes non-computable choices) [LH].
We may hope that some day there may be a back infusion of type-theoretic proofs into
homotopy theory.
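
For reference, the two principles discussed in this section can be written symbolically. This is
a schematic rendering in commonly used notation, not a quotation from the works cited; the
symbols $A$, $B$, $f$, $g$ and the universe $\mathcal{U}$ are introduced here for illustration.

```latex
% Function extensionality: functions that agree at every argument are equal.
\Bigl( \prod_{x : A} f(x) = g(x) \Bigr) \to (f = g)

% Univalence: for types A and B in a universe U, the canonical map from
% identifications of A with B to equivalences between A and B is itself
% an equivalence.
(A =_{\mathcal{U}} B) \simeq (A \simeq B)
```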

2.8 language of mathematics 

Ganesalingam's thesis is the most significant linguistic study of the language of
mathematics to date [Gan09], [Gan10]. Ganesalingam was awarded the 2011 Beth Prize for
the best dissertation in Logic, Language, or Information. Although this research is still
at an early stage, it suggests that the mechanical translation of mathematical prose into
formal computer syntax that faithfully represents the semantics is a realistic hope for
the not-too-distant future.

The linguistic problems surrounding the language of mathematics differ in various
ways from those of, say, standard English. A mathematical text introduces new definitions
and notations as it progresses, whereas in English, the meaning of words is generally
fixed from the outset. Mathematical writing freely mixes English with symbolic
expressions. At the same time, mathematics is self-contained in a way that English can
never be; to understand English is to understand the world. By contrast, the meaning in
a carefully written mathematical text is determined by Zermelo-Fraenkel set theory (or
your favorite foundational system).

* HOL Light avoids the problem of function extensionality by positing extensionality as a
mathematical axiom.

Ganesalingam's analysis of notational syntax is broad enough to treat quite general
mixfix operations, which generalize infix (e.g. +), postfix (e.g. factorial !), and prefix
(cos). He analyzes subscripted infix operators (such as a semidirect product
$H \rtimes_\alpha N$), multi-symboled operators (such as the three-symboled [ : ] operator
for the degree $[K : k]$ of a field-extension), prefixed words ($R$-module), text within
formulas $\{(a, b) \mid a \text{ is a factor of } b\}$, unusual script placement ${}^L G$,
chained relations $a < b < c$, ellipses $1 + 2 + \cdots + n$, contracted forms
$x, y \in \mathbb{N}$, and exposed formulas (such as "for all $x > 0$, ... " to mean
"for all $x$, if $x > 0$, then ... ").

The thesis treats what is called the formal mode of the language of mathematics
- the language divested of all the informal side-remarks. The syntax is treated as a
context-free grammar, and the semantics are analyzed with a variant of discourse
representation theory, which in my limited understanding is something very similar to
first-order logic, but different in one significant aspect: it provides a theory of pronoun
references, or put more precisely, a theory of what may be the "legitimate antecedent
for anaphor."

A major issue in Ganesalingam's thesis is the resolution of ambiguity. For example, 
in the statement 

P is prime (3) 

the term 'prime' may mean prime number, prime ideal, or prime manifold. His solution 
is to attach type information to terms (in the sense of types as discussed above). The 
reading of (3) depends on the type of P, variously a number, a subset of a ring, or a
manifold. In this analysis, resolution of ambiguity becomes a task of a type inference 
engine. 
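
The idea that the type of P selects the reading of "P is prime" can be sketched in a few lines
of Python. This is a toy illustration, not Ganesalingam's system: the classes `Number` and
`IdealOfZ` and the function `is_prime` are hypothetical names introduced here, and the only
ring considered is the ring of integers, whose ideals are each generated by a single integer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Number:
    value: int


@dataclass(frozen=True)
class IdealOfZ:
    """The ideal (generator) of the ring of integers."""
    generator: int


def is_prime_number(n: int) -> bool:
    """Trial division; adequate for small illustrative inputs."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def is_prime(p) -> bool:
    """Resolve the ambiguous predicate 'P is prime' by the type of P."""
    if isinstance(p, Number):
        return is_prime_number(p.value)
    if isinstance(p, IdealOfZ):
        # (g) is a prime ideal of Z exactly when g = 0 or |g| is a prime number.
        return p.generator == 0 or is_prime_number(abs(p.generator))
    raise TypeError(f"no reading of 'prime' for {type(p).__name__}")


# The same word gives different answers for the number 0 and the ideal (0).
print(is_prime(Number(0)))
print(is_prime(IdealOfZ(0)))
print(is_prime(Number(7)), is_prime(IdealOfZ(7)))
```

In a real system the dispatch would happen during type inference, before any evaluation, but
the principle is the same: the ambiguity disappears once the type of P is known.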

Because of the need for type information, Ganesalingam raises questions about the 
suitability of Zermelo-Fraenkel set theory as the ultimate semantics of mathematics. A 
number of formal-proof researchers have been arguing in favor of typed foundational 
systems for many years. It is encouraging that there is remarkable convergence between 
Ganesalingam's linguistic analysis, innovations in the Mizar proof assistant, and the de- 
velopment of abstract algebra in Coq. For example, in various camps we find ellipses 
(aka big operators), mixfix operators, type inference, missing argument inference
mechanisms, and so forth. Also see [Hoe11] and [Pas07]. Mathematical abuses of notation
have turned out to be rationally construed after all!

2.9 looking forward 

Let's take the long-term view that the longest proofs of the last century are of insignificant
complexity compared to what awaits. Why would we limit our creative endeavors
to 10,000 page proofs when we have tools that allow us to go to a million pages or
more? So far it is rare for a computer proof to defy human understanding. No human
has been able to make sense of an unpublished 1500 page computer-generated proof
about Bruck loops* [PS08]. Eventually, we will have to content ourselves with fables
that approximate the content of a computer proof in terms that humans can comprehend.

Turing's great theoretical achievements were to delineate what a computer can do 
in the concept of a universal Turing machine, to establish limits to what a computer 
can do in his solution to the Entscheidungsproblem, and yet to advocate nonetheless 
that computers might imitate all intelligent activity. It remains a challenging research 
program: to show that one limited branch of mathematics, computation, might stand for 
all mathematical activity. 

In the century since Turing's birth, the computer has become so ubiquitous and the 
idea of computer as brain so commonplace that it bears repeating that we must still 
think very long and hard about how to construct a computer that can imitate a living, 
thinking mathematician. 

Proof assistant technology is still under development in labs; far more is needed 
before it finds widespread adoption. Ask any proof assistant researcher, and you will 
get a sizable list of features to implement: more automation, better libraries, and better 
user interfaces! Wiedijk discusses ten design questions for the next generation of proof
assistants, including the type system, which axiomatic foundations of mathematics to
use, and the language of proof scripts [Wie10b].

Everyone actively involved in proof formalization experiences the incessant barrage 
of problems that have been solved multiple times before and that other users will have 
to solve multiple times again, because the solutions are not systematic. To counter this, 
the DRY "Don't Repeat Yourself" principle of programming, formulated in [HT00],
has been carried to a refreshing extreme by Carette in his proof assistant design. For
example, in his designs, a morphism is defined only once, eliminating the need for
separate definitions of a morphism of modules, of algebras, of varieties, and so forth.
Carette's other design maxims include "math has a lot of structure; use it" and "abstract
mathematical structures produce the best code" [CES11]. Indeed, mathematicians turn
to abstraction to bring out relevant structure. This applies to computer code and
mathematical reasoning alike. American Math Society guidelines for mathematical writing
apply directly to the computer: "omit any computation which is routine. . . . Merely
indicate the starting point, describe the procedure, and state the outcome" [DCF+62] (except
that computations should be automated rather than entirely omitted).
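
As a rough illustration of defining a morphism only once, the Python sketch below treats a
structure as a finite carrier together with a table of named operations, and checks
preservation of every operation with a single generic function. It is not Carette's design:
the names `Structure`, `is_morphism`, and `cyclic_ring` are assumptions made for this
example, and the check is by brute force over finite carriers.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, Dict, FrozenSet, Hashable, Tuple


@dataclass(frozen=True)
class Structure:
    carrier: FrozenSet[Hashable]
    # operation name -> (arity, function)
    ops: Dict[str, Tuple[int, Callable[..., Hashable]]]


def is_morphism(f, source: Structure, target: Structure) -> bool:
    """One definition for all signatures: f must commute with every operation."""
    if source.ops.keys() != target.ops.keys():
        return False
    for name, (arity, op_source) in source.ops.items():
        arity_target, op_target = target.ops[name]
        if arity != arity_target:
            return False
        for args in product(source.carrier, repeat=arity):
            if f(op_source(*args)) != op_target(*(f(a) for a in args)):
                return False
    return True


def cyclic_ring(n: int) -> Structure:
    """The integers modulo n with ring operations (an example for this sketch)."""
    return Structure(
        carrier=frozenset(range(n)),
        ops={
            "zero": (0, lambda: 0),
            "add": (2, lambda a, b: (a + b) % n),
            "neg": (1, lambda a: (-a) % n),
            "mul": (2, lambda a, b: (a * b) % n),
        },
    )


z6, z3 = cyclic_ring(6), cyclic_ring(3)

# Reduction modulo 3 preserves every operation; a shifted map does not.
print(is_morphism(lambda a: a % 3, z6, z3))
print(is_morphism(lambda a: (a + 1) % 3, z6, z3))
```

Dropping `"mul"` from both operation tables turns the same call into a check for a morphism
of abelian groups, with no second definition needed.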

We need to separate the concerns of construction, maintenance, and presentation
of proofs. The construction of formal proofs from a mathematical text is an extremely
arduous process, and yet I often hear proposals that would increase the labor needed to
formalize a proof, backed by secondary goals such as ease of maintenance, elegance of
presentation, fidelity to printed texts, and pedagogy.† Better to avail ourselves of
automation that was not available in the day of paper proofs, and to create new mathematical
styles suited to the medium, with proofs that variously look like a computer-aided
design session, a functional program, or a list of hypotheses as messages in gmail. The
most pressing concern is to reduce the skilled labor it takes a user to construct a formal
proof from a pristine mathematical text.

* The theorem states that Bruck loops with abelian inner mapping group are centrally nilpotent
of class two.

† To explain the concept of separation of concerns, Dijkstra tells the story of an old initiative to
create a new programming language that failed miserably because the designers felt that the
new language had to look just like FORTRAN to gain broad acceptance. "The proper technique
is clearly to postpone the concerns for general acceptance until you have reached a result of
such a quality that it deserves acceptance" [Dij82].

The other concerns of proof transformation should be spun off as separate research 
activities: refactored proofs, proof scripts optimized for execution time, translations 
into other proof assistants, natural language translations, natural language abstracts, 
probabilistically checkable proofs, searchable metadata extracts, and proof mining.

For a long time, proof formalization technology was unable to advance beyond the 
mathematics of the 19th century, picking classical gems such as the Jordan curve the- 
orem, the prime number theorem, or Dirichlet's theorem on primes in arithmetic pro- 
gressions. With the Feit-Thompson theorem, formalization has risen to a new level, by 
taking on the work of a Fields medalist. 

At this level, there is an abundant supply of mathematical theorems to choose from.
A Dutch research agenda lists the formalization of Fermat's Last Theorem as the first
in a list of "Ten Challenging Research Problems for Computer Science" [Ber05].
Hesselink predicts that this one formalization project alone will take about "fifty years, with
a very wide margin." Small pieces of the proof of Fermat, such as class field theory, the
Langlands-Tunnell theorem, or the arithmetic theory of elliptic curves would be a fitting
starting point. The aim is to develop technologies until formal verification of theorems
becomes routine at the level of the Atiyah-Singer index theorem, Perelman's proof of the
Poincaré conjecture, the Green-Tao theorem on primes in arithmetic progression, or
Ngô's proof of the fundamental lemma.

Starting from the early days of Newell, Shaw, and Simon's experiments, researchers 
have dreamed of a general-purpose mechanical problem solver. Generations later, after
untold trials, it remains an unwavering dream. I will end this section with one of
the many proposals for a general problem solving algorithm. Kurzweil breaks general 
problem solving into three phases: 

1. State your problem in precise terms. 

2. Map out the contours of the solution space by traversing it recursively, within the 
limits of available computational resources. 

3. Unleash an evolutionary algorithm to configure a neural net to tackle the remaining 
leaves of the tree. 

He concludes, "And if all of this doesn't work, then you have a difficult problem
indeed" [Kur99]. Yes, indeed we do! Some day, energy and persistence will conquer.
3 Issues of Trust

We all have first-hand experience of the bugs and glitches of software. We exchange sto- 
ries when computers run amok. Science recently reported the story of a textbook "The 
Making of a Fly" that was on sale at Amazon for more than 23 million dollars [Sci11].
The skyrocketing price was triggered by an automated bidding war between two sellers, 
who let their algorithms run unsupervised. The textbook's author, Berkeley professor 
Peter Lawrence, said he hoped that the price would reach "a billion." An overpriced 
textbook on the fly is harmless, except for students who have it as a required text. 

But what about the Flash Crash on Wall Street that brought a 600 point plunge in the
Dow Jones in just 5 minutes at 2:41 pm on May 6, 2010? According to the New York
Times [NYT10], the flash crash started when a mutual fund used a computer algorithm
"to sell $4.1 billion in futures contracts." The algorithm was designed to sell "without
regard to price or time. . . . [A]s the computers of the high-frequency traders traded
[futures] contracts back and forth, a 'hot potato' effect was created." When computerized
traders backed away from the unstable markets, share prices of major companies
fluctuated even more wildly. "Over 20,000 trades across more than 300 securities were
executed at prices more than 60% away from their values just moments before" [SEC10].
Throughout the crash, computers followed algorithms to a T, to the havoc of the global
economy.

3.1 mathematical error 

Why use computers to verify mathematics? The simple answer is that carefully
implemented proof checkers make fewer errors than mathematicians (except J.-P. Serre).

Incorrect proofs of correct statements are so abundant that they are impossible to
catalogue. Ralph Boas, former executive editor of Math Reviews, once remarked that
proofs are wrong "half the time" [Aus08]. Kempe's claimed proof of the four-color
theorem stood for more than a decade before Heawood refuted it [Mac01, p. 115].
"More than a thousand false proofs [of Fermat's Last Theorem] were published between
1908 and 1912 alone" [Cor10]. Many published theorems are like the hanging chad
ballots of the 2000 U.S. presidential election, with scrawls too ambivalent for a clear yea
or nay. One mathematician even proposed to me that a new journal is needed that, unlike
the others, only publishes reliable results. Euclid gave us a method, but even he erred in
the proof of the very first proposition of the Elements when he assumed without proof
that two circles, each passing through the other's center, must intersect. The concept
that is needed to repair the gap in Euclid's reasoning is an intermediate value theorem.
This defect was not remedied until Hilbert's 'Foundations of Geometry.'

Examples of widely accepted proofs of false or unprovable statements show that
our methods of proof-checking are far from perfect. Lagrange thought he had a proof
of the parallel postulate, but had enough doubt in his argument to withhold it from
publication. In some cases, entire schools have become sloppy, such as the Italian
school of algebraic geometry or real analysis before the revolution in rigor towards
the end of the nineteenth century. Plemelj's 1908 accepted solution to Hilbert's 21st
problem on the monodromy of linear differential equations was refuted in 1989 by
Bolibruch. Auslander gives the example of a theorem* published by Waraszkiewicz in
1937, generalized by Choquet in 1944, then refuted with a counterexample by Bing in
1948 [Aus08]. Another example is the approximation problem for Sobolev maps between
two manifolds [Bet91], which contains a faulty proof of an incorrect statement.
The corrected theorem appears in [HL03]. Such examples are so plentiful that a Wiki
page has been set up to classify them, with references to longer discussions at Math
Overflow [Wik11], [Ove09], [Ove10].

Theorems that are calculations or enumerations are especially prone to error. Feynman
laments, "I don't notice in the morass of things that something, a little limit or sign,
goes wrong. . . . I have mathematically proven to myself so many things that aren't true"
[Fey00, p. 885]. Elsewhere, Feynman describes two teams of physicists who carried out
a two-year calculation of the electron magnetic moment and independently arrived at
the same predicted value. When experiment disagreed with prediction, the discrepancy
was eventually traced to an arithmetic error made by the physicists, whose calculations
were not so independent as originally believed [Fey85, p. 117]. Pontryagin and
Rokhlin erred in computing stable homotopy groups of spheres. Little's tables of knots
from 1885 contain duplicate entries that went undetected until 1974. In enumerative
geometry, in 1848, Steiner counted 7776 plane conics tangent to 5 general plane conics,
when there are actually only 3264. One of the most persistent blunders in the history of
mathematics has been the misclassification (or misdefinition) of convex Archimedean
polyhedra. Time and again, the pseudo rhombic cuboctahedron has been overlooked or
illogically excluded from the classification (Figure 21) [Grü11].

Fig. 21. Throughout history, the pseudo rhombic cuboctahedron has been overlooked or
misclassified.




3.2 In HOL Light we trust 

To what extent can we trust theorems certified by a proof assistant such as HOL Light? 
There are various aspects to this question. Is the underlying logic of the system con- 
sistent? Are there any programming errors in the implementation of the system? Can a 
devious user find ways to create bogus theorems that circumvent logic? Are the under- 
lying compilers, operating system, and hardware reliable? 

As mentioned above, formal methods represent the best cumulative effort of logicians,
computer scientists and mathematicians over the decades and even over the
centuries to create a trustworthy foundation for the practice of mathematics, and by
extension, the practice of science and engineering.

* Waraszkiewicz's claim was that every homogeneous plane continuum is a simple closed curve.

3.3 a network of mutual verification 

John Harrison repeats the classical question "Quis custodiet ipsos custodes" - who
guards the guards [Har06]? How do we prove the correctness of the prover itself? In
that article, he proves the consistency of the HOL Light logic and the correctness of its
implementation in computer code. He makes this verification in HOL Light itself! To
skirt Gödel's theorem, which implies that HOL Light - if consistent - cannot prove its
own consistency, he gives two versions of his proof. The first uses HOL Light to verify
a weakened version of HOL Light that does not have the axiom of infinity. The second
uses a HOL Light with a strengthened axiom of infinity to verify standard HOL Light.

Recently, Adams has implemented a version of HOL called HOL Zero. His system
has the ability to mechanically import proofs that were developed in HOL Light [Ada09].
He imported the self-verification of HOL Light, to obtain an external verification. You
see where this is going. As mechanical translation capabilities are developed for proof
assistants, it becomes possible for different proof assistants to share consistency proofs,
similar to the way that different axiomatic systems give relative consistency proofs of
one another. We are headed in the direction of knowing that if the logic or implementation
of one proof assistant has an error, then all other major proof assistants must fail
in tandem. Other self-verification projects are Coq in Coq (Coc) and ACL2 in ACL2
(Milawa) [Bar98], [Dav09].

3.4 hacking HOL 

Of course, every formal verification project is a verification of an abstract model of the 
computer code, the computer language, and its semantics. In practice, there are gaps 
between the abstract model and implementation. 

This leaves open the possibility that a hacker might find ways to create an unauthorized
theorem; that is, a theorem generated by some means other than the rules of
inference of the HOL logic. Indeed, there are small openings that a hacker can exploit.*
Adams maintains a webpage of known vulnerabilities in his system and offers a cash
bounty to anyone who uncovers a new vulnerability.

These documented vulnerabilities need to be kept in perspective. They lie at the
fringe of the most reliable software products ever designed. Proof assistants are used to
verify the correctness of chips and microcode [Fox03], operating system kernels [KAE+10],
compilers [Ler06], safety-critical software such as aircraft guidance systems, security
protocols, and mathematical theorems that defeat the usual refereeing process.

* For example, strings are mutable in HOL Light's source language, Objective CAML, allowing
theorems to be maliciously altered. Also, Objective CAML has object magic, which is a
way to defeat the type system. These vulnerabilities and all other vulnerabilities that I know
would be detected during translation of the proof from HOL Light to HOL Zero. A stricter
standard is Pollack consistency, which requires a proof assistant to avoid the appearance of
inconsistency [Ada09], [Wie10a]. For example, some proof assistants allow the substitution of
a variable whose name is a meaningless sequence of characters 'n<0 ∧ 0' for $t$ in
$\exists n.\ t < n$ to obtain a Pollack-inconsistency $\exists n.\ n<0 \land 0 < n$.
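
The Pollack-inconsistency mentioned in the footnote on hacking is easy to reproduce in
miniature. The Python sketch below is a toy term language with a naive printer, written
only for illustration: the classes `Var`, `Less`, and `Exists` and the functions `substitute`
and `show` are hypothetical names and do not correspond to the internals of any proof
assistant. The substitution itself is logically sound; only the printed form is misleading.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Less:
    left: Var
    right: Var


@dataclass(frozen=True)
class Exists:
    bound: str
    body: Less


def substitute(term, old: str, new: Var):
    """Replace free occurrences of the variable named `old` by `new`."""
    if isinstance(term, Var):
        return new if term.name == old else term
    if isinstance(term, Less):
        return Less(substitute(term.left, old, new), substitute(term.right, old, new))
    if isinstance(term, Exists):
        if term.bound == old:
            return term  # `old` is not free under this binder
        if term.bound == new.name:
            raise ValueError("substitution would capture the bound variable")
        return Exists(term.bound, substitute(term.body, old, new))
    raise TypeError(f"unknown term: {term!r}")


def show(term) -> str:
    """Naive printer: variable names are emitted verbatim, with no quoting."""
    if isinstance(term, Var):
        return term.name
    if isinstance(term, Less):
        return f"{show(term.left)} < {show(term.right)}"
    if isinstance(term, Exists):
        return f"∃{term.bound}. {show(term.body)}"
    raise TypeError(f"unknown term: {term!r}")


# A harmless statement with a free variable t.
statement = Exists("n", Less(Var("t"), Var("n")))

# Instantiate t with a variable whose name merely looks like a formula.
misleading = substitute(statement, "t", Var("n<0 ∧ 0"))

print(show(statement))
print(show(misleading))
```

The second line printed reads like an absurd claim about n, although the underlying term
still has the harmless shape "some n exceeds this oddly named variable". A printer that quotes
or rejects such names removes the appearance of inconsistency.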

Some take the view that nothing short of absolute certainty in mathematics gives
an adequate basis for science. Poincaré was less exacting,* only demanding that the
imprecision of calculation not exceed experimental error. As Harrison reminds us, "a
foundational death spiral adds little value" [Har10].



3.5 soft errors 

Mathematicians often bring up the "cosmic ray argument" against the use of computers 
in math. Let's look at the underlying science. 

A soft error in a computer is a transient error that cannot be attributed to permanent 
hardware defects nor to bugs in software. Hard errors - errors that can be attributed 
to a lasting hardware failure - also occur, but at rates that are ten times smaller than 
soft errors [MW04]. Soft errors come from many sources. A typical soft error is caused
by cosmic rays, or rather by the shower of energetic neutrons they produce through 
interactions in the earth's atmosphere. A nucleus of an atom in the hardware can capture 
one of these energetic neutrons and throw off an alpha particle, which strikes a memory 
circuit and changes the value stored in memory. To the end user, a soft error appears as 
a gremlin, a seemingly inexplicable random error that disappears when the computer is 
rebooted and the program runs again. 

As an example, we will calculate the expected number of soft errors in one of the
mathematical calculations of Section 1.17. The Atlas Project calculation of the $E_8$
character table was a 77 hour calculation that required 64 gigabytes RAM [Ada07]. Soft
error rates are generally measured in units of failures-in-time (FIT). One FIT is
defined as one error per $10^9$ hours of operation. If we assume a soft error rate of $10^3$ FIT
per Mbit (which is a typical rate for a modern memory device operating at sea level†
[Tez04]), then we would expect there to be about 40 soft errors in memory during the
calculation:

$$\frac{10^3\ \text{FIT}}{1\ \text{Mbit}} \cdot 64\ \text{GB} \cdot 77\ \text{hours}
= \frac{10^3\ \text{errors}}{10^9\ \text{hours} \cdot \text{Mbit}} \cdot (64 \cdot 8 \cdot 10^3\ \text{Mbit}) \cdot 77\ \text{hours}
\approx 39.4\ \text{errors}.$$

This example shows that soft errors can be a realistic concern in mathematical calculations.
(As added confirmation, the calculation has now been repeated about 5 times
with identical results.)
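
The arithmetic can be replayed in a few lines of Python. This is a sketch of the estimate in
the text, with the function names chosen here for illustration. The second function goes one
step beyond the text by assuming that soft errors arrive independently at a constant rate (a
Poisson model), which is an added assumption rather than something claimed in the sources
cited.

```python
import math

HOURS_PER_FIT = 1e9     # one FIT is one error per 10^9 hours of operation
MBIT_PER_GB = 8 * 1e3   # decimal convention used in the text: 1 GB = 8 * 10^3 Mbit


def expected_soft_errors(fit_per_mbit: float, memory_gb: float, hours: float) -> float:
    """Expected number of soft errors for a given rate, memory size, and run time."""
    errors_per_hour_per_mbit = fit_per_mbit / HOURS_PER_FIT
    return errors_per_hour_per_mbit * memory_gb * MBIT_PER_GB * hours


def probability_of_any_error(expected: float) -> float:
    """Chance of at least one error, ASSUMING a Poisson model (not from the text)."""
    return 1.0 - math.exp(-expected)


# The E8 character table run: 10^3 FIT per Mbit, 64 GB of RAM, 77 hours.
e8_errors = expected_soft_errors(1e3, 64, 77)
print(round(e8_errors, 1))
print(probability_of_any_error(e8_errors))
```

Because the estimate is linear in both memory and run time, halving either one halves the
expected number of errors.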

In software that has been thoroughly debugged, soft errors become the most sig- 
nificant source of error in computation. Although there are numerous ways to pro- 
tect against soft errors with methods such as repeated calculations and error-correcting 
codes, hardware redesign carries an economic cost. In fact, soft errors are on the rise 
through miniaturization: a smaller circuit generally has a lower capacitance and
responds to less energetic alpha particles than a larger circuit.

* "Il est donc inutile de demander au calcul plus de précision qu'aux observations; mais on ne
doit pas non plus lui en demander moins" [Poi92]. (It is therefore useless to ask more precision
of calculation than of observation; but neither should one ask less of it.)

† The soft error rate is remarkably sensitive to elevation; a calculation in Denver produces about
three times more soft errors than the same calculation on identical hardware in Boston.

Soft errors are depressing news in the ultra-reliable world of proof assistants. Alpha
particles rain on perfect and imperfect software alike. In fact, because the number of
soft errors is proportional to the execution time of a calculation, and a proof assistant is
slow and methodical, the probability of a soft error during a calculation inside a proof
assistant can be much higher than the probability when done outside.

Soft errors and susceptibility to hacking have come to be more than a nuisance 
to me. They alter my philosophical views of the foundations of mathematics. I am a 
computational formalist - a formalist who admits physical limits to the reliability of any 
verification process, whether by hand or machine. These limits taint even the simplest 
theorems, such as our ability to verify that 1 + 1 = 2 is a consequence of a set of axioms. 
One rogue alpha particle brings all my schemes of perfection to nought. The rigor of 
mathematics and the reliability of technology are mutually dependent; math to provide
ever more accurate models of science, and technology to provide ever more reliable 
execution of mathematical proofs. 
4 Concluding Remarks

To everyone who has made it this far in this essay, I highly recommend MacKenzie's
book [Mac01]. It is written by a sociologist with a fine sensitivity to mathematics. The
author received the Robert K. Merton Award of the American Sociological Association 
in 2003 for this book. 

A few years ago, a special issue of the Notices of the AMS presented a general 
introduction to formal proofs [Hal08], [Har08], [Gon08], [Wie08]. I also particularly
recommend the body of research articles by Harrison, Gonthier, and Carette. 

I thank Adams (both Jeff and Mark), Urban, Carette, Kapulkin, Harrison, and
Manfredi for conversations about ideas in this article.